Date of Graduation

Spring 2026

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

Committee Chairperson

Md Amiruzzaman, PhD

Committee Member

Stefanie Amiruzzaman, PhD

Committee Member

Linh Ngo, PhD

Abstract

This thesis investigates the deployment of high-accuracy Isolated ASL Recognition (ISLR) in resource-constrained edge environments. We train a lightweight Spatio-Temporal Attention Network (SSTAN, ∼2.7 M parameters, ∼10 MB) on the WLASL-100 benchmark, achieving 75.25% Top-1 and 88.24% Top-5 accuracy with 139 ms CPU-only inference. A systematic comparison against frontier multimodal LLMs (Gemini 3 Flash, Gemini 3.1 Pro, Qwen 3 VL) shows SSTAN outperforms the best LLM baseline by ∼1.85× in accuracy while being 22–230× faster and up to 40× cheaper annually. The LLMs’ core limitation is a lack of fine-grained temporal perception; they impose English-language semantic priors rather than learning the articulatory distinctions that define ASL signs. To demonstrate practical impact, we integrate the model into a browser-based ASL Word Search Game where users sign words via webcam instead of typing, grounding vocabulary practice in embodied, gesture-driven interaction.
