Running AI on Qualcomm Phones: Hand Pose Tracking

Posted by 济曝喊 on 2025-8-13 15:18:23

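(Original author: @CSDN_伊利丹~怒风)

Environment Setup

Phone

Test phone model: Redmi K60 Pro
Processor: Snapdragon 8 Gen 2 Mobile Platform (8gen2)
RAM: 8.0 GB, LPDDR5X-8400, 67.0 GB/s
Cameras: front 16 MP; rear 50 MP + 8 MP + 2 MP
AI compute: NPU 48 TOPS INT8; GPU 1536 ALU x 2 x 680 MHz = 2.089 TFLOPS
Tip: any phone will work; the more powerful the phone, the faster the demo runs.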
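
The GPU figure in the spec list is a peak-throughput product. A minimal sketch of the arithmetic, assuming the factor of 2 stands for the two floating-point operations (a multiply and an add) that one ALU can issue per cycle; the function name is illustrative only:

```python
def peak_tflops(alus: int, flops_per_cycle: int, clock_mhz: float) -> float:
    """Peak throughput in TFLOPS = ALUs x FLOPs per ALU per cycle x clock (Hz) / 1e12."""
    return alus * flops_per_cycle * clock_mhz * 1e6 / 1e12


# Values from the spec list: 1536 ALUs, x 2, 680 MHz.
gpu_tflops = peak_tflops(1536, 2, 680)
print(f"{gpu_tflops:.3f} TFLOPS")  # 2.089 TFLOPS
```

This is a theoretical ceiling; the frame rate the demo actually reaches also depends on memory bandwidth and on which backend each model runs on.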
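
Software

APP: AidLux2.0
System environment: Ubuntu 20.04.3 LTS
Tip: code runs more smoothly once you are logged in to AidLux. Keep the AidLux app in the foreground while the code runs so the system does not reclaim the process, and keep the screen on, because the phone usually goes to sleep some time after the screen turns off. To keep the app resident in the background, grant it the required permission.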
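
Algorithm Demo

Code Function Analysis

This code is an AI-based hand detection and recognition program. It captures frames from the camera in real time, detects hands in each frame, locates the hand keypoints, and draws the hand outline and keypoints on the frame.

Code Application Scenarios

The code performs real-time hand detection and keypoint recognition, which is useful in a range of scenarios:

[*]Gesture recognition and interaction systems

[*]Smart home control: control lights, appliances, and other devices with specific gestures
[*]Virtual reality / augmented reality (VR/AR): use hand motion as interaction input for greater immersion
[*]Game control: replace traditional controllers for more natural gameplay

[*]Human-robot collaboration and robot control

[*]Industrial robot guidance: direct robots to carry out tasks with gestures
[*]Remote operation: operators control remote equipment with gestures

[*]Medical care and rehabilitation

[*]Hand rehabilitation training: monitor a patient's hand movements and assess recovery progress
[*]Surgical assistance: surgeons operate medical equipment or view imaging data with gestures

[*]Education and presentations

[*]Interactive teaching: teachers control how teaching content is displayed with gestures
[*]Presentation systems: speakers control slides or other presentation content with gestures

[*]Security monitoring

[*]Abnormal behavior detection: analyze hand movements to identify potentially threatening behavior
[*]Identity verification: identify people using hand features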
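
Every scenario in this list needs a step that turns the 21 keypoints into a gesture label, which the demo itself does not include. The following is a minimal sketch of one common heuristic, not part of the original code: a finger counts as extended when its tip lies farther from the wrist than its second joint. The function names, the labels, and the `(x, y)` tuple input are assumptions made for illustration; the indices follow the keypoint order used in the demo (0 wrist, 8/12/16/20 fingertips, 6/10/14/18 second joints). The thumb is left out because this distance test is unreliable for it.

```python
import math

# (tip index, second-joint index) for the index, middle, ring and little finger
FINGERS = {"index": (8, 6), "middle": (12, 10), "ring": (16, 14), "little": (20, 18)}
WRIST = 0


def extended_fingers(landmarks):
    """Return the names of fingers whose tip is farther from the wrist than the second joint.

    landmarks: sequence of 21 (x, y) pairs in any consistent unit (pixels or normalized).
    """
    wrist = landmarks[WRIST]
    result = []
    for name, (tip, joint) in FINGERS.items():
        if math.dist(landmarks[tip], wrist) > math.dist(landmarks[joint], wrist):
            result.append(name)
    return result


def classify(landmarks):
    """Map the set of extended fingers to a coarse, illustrative gesture label."""
    fingers = extended_fingers(landmarks)
    if not fingers:
        return "fist"
    if fingers == ["index"]:
        return "point"
    if len(fingers) == 4:
        return "open"
    return "other"
```

In the demo, the input would be the first two columns of the reshaped landmark output. A real application would likely smooth the label over several frames, since single-frame keypoints jitter.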

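AidLite Inference Engine Features

The AidLite inference engine is a lightweight AI inference engine optimized for edge devices. Its core features are:

[*]Multi-framework support

[*]Supports multiple model formats, including TensorFlow Lite and ONNX
[*]This code uses TensorFlow Lite models (.tflite)

[*]Hardware acceleration

[*]Supports several acceleration backends: GPU, CPU, and NPU
[*]The palm detection model runs with GPU acceleration to improve real-time performance
[*]The keypoint model runs on the CPU, which the author chose to preserve accuracy

[*]Lightweight design

[*]Optimized for resource-constrained edge devices
[*]Low memory footprint and efficient model inference

[*]Easy-to-use API

[*]Provides APIs for the complete flow: model loading, input setup, inference execution, and output retrieval
[*]The code manages models and runs inference through classes such as aidlite.Model, aidlite.Config, and aidlite.InterpreterBuilder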
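
The demo repeats the same create, configure, build, init, and load sequence for each model, and only prints a message when a step fails. Below is a minimal sketch of a helper that wraps that sequence and raises instead, using only the AidLite calls that appear in the demo code. The function name `build_interpreter` and its parameters are assumptions for illustration, and the `aidlite` module is passed in as an argument so the function can be imported and checked on a machine without AidLux:

```python
def build_interpreter(aidlite, model_path, in_shape, out_shape, accelerate_type, threads=4):
    """Create, configure, initialize and load a TFLite interpreter; raise on any failure.

    aidlite: the imported aidlite module (passed in rather than imported here).
    accelerate_type: e.g. aidlite.AccelerateType.TYPE_GPU or TYPE_CPU, as in the demo.
    """
    model = aidlite.Model.create_instance(model_path)
    if model is None:
        raise RuntimeError(f"create_instance failed for {model_path}")
    model.set_model_properties(in_shape, aidlite.DataType.TYPE_FLOAT32,
                               out_shape, aidlite.DataType.TYPE_FLOAT32)

    config = aidlite.Config.create_instance()
    config.implement_type = aidlite.ImplementType.TYPE_FAST
    config.framework_type = aidlite.FrameworkType.TYPE_TFLITE
    config.accelerate_type = accelerate_type
    config.number_of_threads = threads

    # The spelling "interpretper" is the method name used in the demo code.
    interpreter = aidlite.InterpreterBuilder.build_interpretper_from_model_and_config(model, config)
    if interpreter is None:
        raise RuntimeError(f"building the interpreter failed for {model_path}")
    if interpreter.init() != 0:
        raise RuntimeError(f"interpreter init failed for {model_path}")
    if interpreter.load_model() != 0:
        raise RuntimeError(f"load_model failed for {model_path}")
    return interpreter
```

On the device this would be called once per model, for example with "models/palm_detection.tflite" and aidlite.AccelerateType.TYPE_GPU. Failing fast avoids the situation where a later call runs on an interpreter that was never loaded.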

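OpenCV (cv2 in the code) Features

OpenCV is a classic computer vision library. In this code it is used mainly for the following:

[*]Image capture and processing

[*]Reads the live camera video stream through cv2.VideoCapture
[*]Image preprocessing: color space conversion (cv2.cvtColor), resizing (cv2.resize), and so on

[*]Image display and visualization

[*]Shows the processed image with cv2.imshow
[*]Draws detection boxes (cv2.rectangle), keypoints (cv2.circle), and connecting lines (cv2.line)

[*]Auxiliary computation

[*]Palm centroid: cv2.moments computes the image moments
[*]Bounding box: cv2.boundingRect computes the smallest upright rectangle enclosing the keypoints

[*]Image operations

[*]Image flipping: cv2.flip handles the mirroring of the front camera
[*]Region extraction: crops the hand region out of the original image for separate processing

OpenCV supplies the groundwork for preparing model input and visualizing model output, so the whole system covers the full flow from image capture to result display.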
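
The palm centroid is the least obvious of these calls. When cv2.moments receives an array of 2D points rather than an image, it treats the points as a polygon outline, and the centroid is then m10/m00 and m01/m00. The sketch below reproduces that polygon-centroid arithmetic in plain Python with the shoelace formula, to show what the ratio means. It is an illustration written for this explanation, not OpenCV's implementation, and the function name is hypothetical.

```python
def polygon_centroid(points):
    """Area centroid of a simple polygon given as a list of (x, y) vertices.

    Corresponds to (m10 / m00, m01 / m00) for the polygon's moments.
    Returns (0, 0) for a degenerate polygon with zero area, mirroring the
    m00 != 0 guard in the demo.
    """
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        return 0, 0
    return cx / (3 * area2), cy / (3 * area2)


# A 4 x 4 square has its centroid at the center.
print(polygon_centroid([(0, 0), (4, 0), (4, 4), (0, 4)]))  # (2.0, 2.0)
```

Because the demo appends the six palm keypoints in index order (0, 1, 5, 9, 13, 17), they trace the palm outline, so the result is the centroid of that outline rather than the plain average of the six points.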
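
AI Models

The code uses two AI models that work together:

[*]Palm detection model (palm_detection.tflite)

[*]A lightweight object detection model dedicated to detecting palms in an image
[*]Model input: a 128×128-pixel RGB image
[*]Model output: palm positions and bounding box information
[*]Role: quickly locates palms in the image and supplies the region of interest (ROI) for the keypoint stage
[*]Acceleration: GPU, to speed up detection

[*]Hand keypoint detection model (hand_landmark.tflite)

[*]Analyzes the hand region in finer detail and identifies 21 keypoints
[*]Model input: a 224×224-pixel RGB image (normally the ROI produced by the palm detection model)
[*]Model output: 21 three-dimensional keypoint coordinates describing the detailed hand pose
[*]Role: precisely locates finger joints, fingertips, and other parts of the hand
[*]Acceleration: runs on the CPU, which the author chose to preserve accuracy

Used together, the two models cover the full flow from detecting the palm position to precisely identifying the 21 hand keypoints, and can track hand motion and pose in real time.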
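
The hand-off between the two models is plain arithmetic: the palm box is turned into a larger square ROI, and the landmark output (in 224-pixel model units) is scaled back into the frame. Below is a minimal sketch of both steps in plain Python. The 224/128 scale and the 0.18 upward shift are taken from the demo's plot_detections; the function names and the tuple-based interface are assumptions for illustration.

```python
def palm_box_to_roi(ymin, xmin, ymax, xmax, frame_w, frame_h):
    """Turn a normalized palm box into a clamped pixel ROI (x0, y0, x1, y1)."""
    # Normalized [0, 1] box -> pixels
    x0, x1 = xmin * frame_w, xmax * frame_w
    y0, y1 = ymin * frame_h, ymax * frame_h
    # Square side: the longer box edge (truncated to whole pixels), scaled by 224/128
    side = max(int(x1 - x0), int(y1 - y0)) * 224.0 / 128.0
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    # Shift up by 18% of the side so the fingers above the palm fit in the crop
    top = cy - side / 2.0 - 0.18 * side
    left = cx - side / 2.0
    # Clamp to the frame
    return (max(0, int(left)), max(0, int(top)),
            min(frame_w, int(left + side)), min(frame_h, int(top + side)))


def landmarks_to_frame(raw, roi, model_size=224):
    """Map a flat [x, y, z, ...] landmark list in model pixels to frame pixel (x, y) pairs."""
    x0, y0, x1, y1 = roi
    roi_w, roi_h = x1 - x0, y1 - y0
    points = []
    for i in range(0, len(raw), 3):
        nx, ny = raw[i] / model_size, raw[i + 1] / model_size  # normalize to [0, 1]
        points.append((x0 + int(nx * roi_w), y0 + int(ny * roi_h)))
    return points


roi = palm_box_to_roi(0.25, 0.25, 0.5, 0.5, frame_w=640, frame_h=480)
print(roi)                                            # (100, 0, 380, 269)
print(landmarks_to_frame([112.0, 112.0, 0.0], roi))  # [(240, 134)]
```

In the example the ROI is cut off at the top edge of the frame, so it is no longer square. The demo handles this the same way: it resizes whatever crop remains to 224×224 and scales the keypoints by the crop's own width and height.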

Demo Code

import cv2
import time
from time import sleep
import subprocess
import math
import sys
import numpy as np
from blazeface import *  # BlazeFace helpers for decoding face/hand detections
import aidlite  # AI inference framework of the AidLux platform
import os

# Get the camera device ID, preferring a USB camera
def get_cap_id():
    try:
        # Build the command; awk extracts the video device number of each USB entry
        cmd = "ls -l /sys/class/video4linux | awk -F ' -> ' '/usb/{sub(/.*video/, \"\", $2); print $2}'"
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        output = result.stdout.strip().split()

        # Convert every captured number to an integer and return the smallest
        video_numbers = list(map(int, output))
        if video_numbers:
            return min(video_numbers)
        else:
            return None
    except Exception as e:
        print(f"An error occurred: {e}")
        return None

# Image preprocessing: convert a frame into the input format the TFLite models expect
def preprocess_image_for_tflite32(image, model_image_size=300):
    try:
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # convert BGR to RGB
        image = cv2.resize(image, (model_image_size, model_image_size))  # resize to the model input size
        image = np.expand_dims(image, axis=0)  # add the batch dimension
        image = (2.0 / 255.0) * image - 1.0  # normalize pixel values to [-1, 1]
        image = image.astype('float32')  # convert to float32
    except cv2.error as e:
        print(str(e))
    return image

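# Draw the hand detection boxes on the image
def plot_detections(img, detections, with_keypoints=True):
    output_img = img
    print(img.shape)
    x_min = [0, 0]  # minimum x of each of the two hands
    x_max = [0, 0]  # maximum x of each of the two hands
    y_min = [0, 0]  # minimum y of each of the two hands
    y_max = [0, 0]  # maximum y of each of the two hands
    hand_nums = len(detections)  # number of hands detected
    print("Found %d hands" % hand_nums)
    if hand_nums > 2:
        hand_nums = 2  # process at most two hands
    for i in range(hand_nums):
        # Detections are normalized; scale them to pixel coordinates
        ymin = detections[i][0] * img.shape[0]
        xmin = detections[i][1] * img.shape[1]
        ymax = detections[i][2] * img.shape[0]
        xmax = detections[i][3] * img.shape[1]
        w = int(xmax - xmin)  # box width
        h = int(ymax - ymin)  # box height
        h = max(h, w)  # use the longer side
        h = h * 224. / 128.  # enlarge the palm box to cover the whole hand

        x = (xmin + xmax) / 2.  # center x
        y = (ymin + ymax) / 2.  # center y

        # Build a square box around the center, shifted up by 18% of its side
        xmin = x - h / 2.
        xmax = x + h / 2.
        ymin = y - h / 2. - 0.18 * h
        ymax = y + h / 2. - 0.18 * h

        # Store the box coordinates for this hand
        x_min[i] = int(xmin)
        y_min[i] = int(ymin)
        x_max[i] = int(xmax)
        y_max[i] = int(ymax)
        p1 = (int(xmin), int(ymin))  # top-left corner
        p2 = (int(xmax), int(ymax))  # bottom-right corner
        cv2.rectangle(output_img, p1, p2, (0, 255, 255), 2, 1)  # draw the box

    return x_min, y_min, x_max, y_max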

# Draw the hand mesh keypoints on the image
def draw_mesh(image, mesh, mark_size=4, line_width=1):
    """Draw the mesh on an image"""
    # The mesh is normalized, so it has to be scaled back to fit
    # the image size.
    image_size = image.shape[0]
    mesh = mesh * image_size
    for point in mesh:
        cv2.circle(image, (int(point[0]), int(point[1])),
                   mark_size, (255, 0, 0), 4)

# Compute the centroid of the palm
def calc_palm_moment(image, landmarks):
    image_width, image_height = image.shape[1], image.shape[0]

    palm_array = np.empty((0, 2), int)

    # Palm outline keypoints: wrist 1, wrist 2, and the bases of the
    # index, middle, ring, and little fingers
    palm_indices = (0, 1, 5, 9, 13, 17)

    # Collect the keypoints of the palm region
    for index, landmark in enumerate(landmarks):
        if math.isnan(landmark[0]):
            landmark[0] = 0
        if math.isnan(landmark[1]):
            landmark[1] = 0
        landmark_x = min(int(landmark[0] * image_width), image_width - 1)
        landmark_y = min(int(landmark[1] * image_height), image_height - 1)

        landmark_point = [np.array((landmark_x, landmark_y))]

        if index in palm_indices:
            palm_array = np.append(palm_array, landmark_point, axis=0)

    # Compute the centroid from the moments of the palm outline
    M = cv2.moments(palm_array)
    cx, cy = 0, 0
    if M['m00'] != 0:
        cx = int(M['m10'] / M['m00'])
        cy = int(M['m01'] / M['m00'])

    return cx, cy

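# Compute the rectangle enclosing the hand keypoints
def calc_bounding_rect(image, landmarks):
    image_width, image_height = image.shape[1], image.shape[0]

    landmark_array = np.empty((0, 2), int)

    # Collect all keypoint coordinates
    for _, landmark in enumerate(landmarks):
        landmark_x = min(int(landmark[0] * image_width), image_width - 1)
        landmark_y = min(int(landmark[1] * image_height), image_height - 1)

        landmark_point = [np.array((landmark_x, landmark_y))]

        landmark_array = np.append(landmark_array, landmark_point, axis=0)

    # Compute the enclosing rectangle
    x, y, w, h = cv2.boundingRect(landmark_array)

    # Return the two corners (left, top, right, bottom), as draw_bounding_rect expects
    return [x, y, x + w, y + h]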

# Draw the enclosing rectangle on the image
def draw_bounding_rect(use_brect, image, brect):
    if use_brect:
        # Bounding rectangle
        cv2.rectangle(image, (brect[0], brect[1]), (brect[2], brect[3]),
                      (0, 255, 0), 2)

    return image

# Draw the hand keypoints and connecting lines on the image
def draw_landmarks(image, cx, cy, landmarks):
    image_width, image_height = image.shape[1], image.shape[0]

    landmark_point = []

    # Keypoint order: 0-1 wrist 1 and 2; 2-4 thumb (base, 1st joint, tip);
    # 5-8 index finger, 9-12 middle finger, 13-16 ring finger, 17-20 little
    # finger (each: base, 2nd joint, 1st joint, tip)
    fingertips = (4, 8, 12, 16, 20)

    # Draw the keypoints
    for index, landmark in enumerate(landmarks):
        landmark_x = min(int(landmark[0] * image_width), image_width - 1)
        landmark_y = min(int(landmark[1] * image_height), image_height - 1)

        landmark_point.append((landmark_x, landmark_y))

        # Every keypoint gets a small circle; fingertips get an extra, larger ring
        cv2.circle(image, (landmark_x, landmark_y), 5, (0, 255, 0), 2)
        if index in fingertips:
            cv2.circle(image, (landmark_x, landmark_y), 12, (0, 255, 0), 2)

    # Keypoint pairs joined by a line (pairs assumed from the keypoint order above)
    connections = [
        (2, 3), (3, 4),                   # thumb
        (5, 6), (6, 7), (7, 8),           # index finger
        (9, 10), (10, 11), (11, 12),      # middle finger
        (13, 14), (14, 15), (15, 16),     # ring finger
        (17, 18), (18, 19), (19, 20),     # little finger
        (0, 1), (1, 2), (2, 5), (5, 9),   # palm
        (9, 13), (13, 17), (17, 0),
    ]

    if len(landmark_point) > 0:
        # Draw the connecting lines
        for start, end in connections:
            cv2.line(image, landmark_point[start], landmark_point[end], (0, 255, 0), 2)

        # Draw the palm centroid
        cv2.circle(image, (cx, cy), 12, (0, 255, 0), 2)

    return image

# Initialize the palm detection model
inShape = [[1, 128, 128, 3]]  # model input shape: one 128x128 RGB image
# Model output shapes. Assumed values (box regressors and scores for 896 anchors);
# check them against the model file.
outShape = [[1, 896, 18], [1, 896, 1]]
model_path = "models/palm_detection.tflite"  # palm detection model path

# Create the Model instance and set the model parameters
model = aidlite.Model.create_instance(model_path)
if model is None:
    print("Create palm_detection model failed !")

# Set the model properties
model.set_model_properties(inShape, aidlite.DataType.TYPE_FLOAT32, outShape, aidlite.DataType.TYPE_FLOAT32)

# Create the Config instance and set the configuration
config = aidlite.Config.create_instance()
config.implement_type = aidlite.ImplementType.TYPE_FAST  # fast inference implementation
config.framework_type = aidlite.FrameworkType.TYPE_TFLITE  # TensorFlow Lite framework
config.accelerate_type = aidlite.AccelerateType.TYPE_GPU  # GPU acceleration
config.number_of_threads = 4  # number of threads

# Create the inference interpreter
fast_interpreter = aidlite.InterpreterBuilder.build_interpretper_from_model_and_config(model, config)
if fast_interpreter is None:
    print("palm_detection model build_interpretper_from_model_and_config failed !")

# Initialize the interpreter
result = fast_interpreter.init()
if result != 0:
    print("palm_detection model interpreter init failed !")

# Load the model
result = fast_interpreter.load_model()
if result != 0:
    print("palm_detection model interpreter load model failed !")

print("palm_detection model load success!")

# Initialize the hand keypoint detection model
model_path1 = "models/hand_landmark.tflite"  # hand keypoint detection model path
inShape1 = [[1, 224, 224, 3]]  # model input shape: one 224x224 RGB image
# Model output shapes. Assumed values (21 x 3 keypoint coordinates plus two
# scalar outputs); check them against the model file.
outShape1 = [[1, 63], [1], [1]]

# Create the Model instance and set the model parameters
model1 = aidlite.Model.create_instance(model_path1)
if model1 is None:
    print("Create hand_landmark model failed !")

# Set the model properties
model1.set_model_properties(inShape1, aidlite.DataType.TYPE_FLOAT32, outShape1,
                            aidlite.DataType.TYPE_FLOAT32)

# Create the Config instance and set the configuration
config1 = aidlite.Config.create_instance()
config1.implement_type = aidlite.ImplementType.TYPE_FAST  # fast inference implementation
config1.framework_type = aidlite.FrameworkType.TYPE_TFLITE  # TensorFlow Lite framework
config1.accelerate_type = aidlite.AccelerateType.TYPE_CPU  # run on the CPU
config1.number_of_threads = 4  # number of threads (the post set this on config by mistake)

# Create the inference interpreter
fast_interpreter1 = aidlite.InterpreterBuilder.build_interpretper_from_model_and_config(model1, config1)
if fast_interpreter1 is None:
    print("hand_landmark model build_interpretper_from_model_and_config failed !")

# Initialize the interpreter
result = fast_interpreter1.init()
if result != 0:
    print("hand_landmark model interpreter init failed !")

# Load the model
result = fast_interpreter1.load_model()
if result != 0:
    print("hand_landmark model interpreter load model failed !")

print("hand_landmark model load success!")

# Load the anchor data used to decode the palm detection output
anchors = np.load('models/anchors.npy').astype(np.float32)

# AidLux platform type and camera ID
aidlux_type = "basic"
# 0 = rear, 1 = front
camId = 1
opened = False

# Keep trying until the camera opens
while not opened:
    if aidlux_type == "basic":
        cap = cv2.VideoCapture(camId, device='mipi')
    else:
        capId = get_cap_id()
        print("usb camera id: ", capId)
        if capId is None:
            print("no found usb camera")
            # Fall back to the front camera (1); if that fails, try 0 (rear)
            cap = cv2.VideoCapture(1, device='mipi')
        else:
            camId = capId
            cap = cv2.VideoCapture(camId)
            # Property 6 is the FOURCC codec: request MJPG from the USB camera
            cap.set(6, cv2.VideoWriter.fourcc('M', 'J', 'P', 'G'))

    if cap.isOpened():
        opened = True
    else:
        print("open camera failed")
        cap.release()
        time.sleep(0.5)

# Hand-detected flag and box coordinates
bHand = False
x_min = [0, 0]
x_max = [0, 0]
y_min = [0, 0]
y_max = [0, 0]
fface = 0.0
use_brect = True

# Main loop: keep grabbing frames and run hand detection and keypoint recognition
while True:

    ret, frame = cap.read()  # read one video frame
    if not ret:
        continue
    if frame is None:
        continue

    # With the front camera, flip horizontally for a natural mirror view
    if camId == 1:
        frame = cv2.flip(frame, 1)

    # Preprocess the frame as input for the palm detection model
    img = preprocess_image_for_tflite32(frame, 128)

    # Hand detection
    if not bHand:
        # Set the input data
        result = fast_interpreter.set_input_tensor(0, img.data)
        if result != 0:
            print("palm_detection model interpreter set_input_tensor() failed")

        # Run the palm detection model
        result = fast_interpreter.invoke()
        if result != 0:
            print("palm_detection model interpreter invoke() failed")

        # Fetch the output data
        raw_boxes = fast_interpreter.get_output_tensor(0)
        if raw_boxes is None:
            print("sample : palm_detection model interpreter->get_output_tensor(0) failed !")

        classificators = fast_interpreter.get_output_tensor(1)
        if classificators is None:
            print("sample : palm_detection model interpreter->get_output_tensor(1) failed !")

        # Decode the detections
        detections = blazeface(raw_boxes, classificators, anchors)

        # Draw the detection boxes and get the box coordinates
        x_min, y_min, x_max, y_max = plot_detections(frame, detections)

        # At least one hand found: go on to keypoint recognition
        if len(detections) > 0:
            bHand = True

    # Keypoint recognition for the detected hands
    if bHand:
        # Reset the flag so the next frame runs palm detection again
        bHand = False
        hand_nums = len(detections)
        if hand_nums > 2:
            hand_nums = 2

        # Run keypoint recognition on each detected hand
        for i in range(hand_nums):

            print(x_min, y_min, x_max, y_max)
            # Clamp the box to the frame
            xmin = max(0, x_min[i])
            ymin = max(0, y_min[i])
            xmax = min(frame.shape[1], x_max[i])
            ymax = min(frame.shape[0], y_max[i])

            # Crop the hand region
            roi_ori = frame[ymin:ymax, xmin:xmax]
            if roi_ori.size == 0:
                continue  # the box lies outside the frame
            # Preprocess the crop as input for the keypoint model
            roi = preprocess_image_for_tflite32(roi_ori, 224)

            # Set the input data
            result = fast_interpreter1.set_input_tensor(0, roi.data)
            if result != 0:
                print("hand_landmark model interpreter set_input_tensor() failed")

            # Run the hand keypoint model
            result = fast_interpreter1.invoke()
            if result != 0:
                print("hand_landmark model interpreter invoke() failed")

            # Fetch the output data
            mesh = fast_interpreter1.get_output_tensor(0)
            if mesh is None:
                print("sample : hand_landmark model interpreter->get_output_tensor(0) failed !")

            # Scale the keypoints from model pixels to [0, 1] and draw them
            mesh = mesh.reshape(21, 3) / 224
            cx, cy = calc_palm_moment(roi_ori, mesh)
            draw_landmarks(roi_ori, cx, cy, mesh)
            frame[ymin:ymax, xmin:xmax] = roi_ori

    # Show the processed frame
    cv2.imshow("", frame)

Model Location

/opt/aidlux/app/aid-examples//hand_track

Model Results

Source: submitted and published by a 豆瓜网 user. In case of infringement, please contact the site administrator for removal.
