Career Opportunities with TETRAMEM INC

A great place to work.

At TetraMem, we are redefining the future of AI with our groundbreaking innovations in In-Memory Computing. Leveraging world-record multi-level RRAM (resistive random-access memory) technology, we deliver highly efficient solutions for AI computation, enabling superior performance and energy efficiency across applications ranging from edge devices to data centers. Our talented team of engineers and industry-leading executives drives this progress, making TetraMem a leader in advanced memory technologies.

If you are passionate about cutting-edge technology and thrive in a fast-paced, collaborative environment, TetraMem is the place for you. Join our global team to shape the future of AI computations and sustainable technology solutions while working at the forefront of innovation. Together, we can make a lasting impact.

Are you ready for new challenges and new opportunities?

Join our team!

Current job opportunities are posted here as they become available.

Subscribe to our RSS feed to receive instant updates as new positions become available.

Senior Machine Learning Engineer

Department: Software - ML & Algorithm
Location: San Jose, CA

Responsibilities:

  • Develop, optimize, and deploy lightweight machine learning models for edge AI applications, particularly for audio processing.
  • Implement and optimize ML models on embedded platforms, including FPGA and custom ASIC solutions.
  • Work closely with hardware and software teams to integrate ML models into production systems.
  • Research and implement state-of-the-art ML techniques to enhance model efficiency, latency, and power consumption for embedded AI applications.
  • Improve inference efficiency and model compression techniques, including quantization, pruning, and knowledge distillation.
  • Collaborate with cross-functional teams to drive innovation and contribute to the overall system architecture.
  • Provide technical leadership and mentorship to junior engineers.
  • Publish research findings, present at conferences, and contribute to open-source projects when applicable.

Requirements:

  • 5+ years of relevant industry experience, or a PhD, in Computer Science, Electrical Engineering, Machine Learning, or a related field.
  • Strong hands-on experience in machine learning, with a focus on edge AI, on-device inference, and deploying lightweight models on resource-constrained devices.
  • Expertise in modern ML frameworks such as PyTorch, TensorFlow (including TensorFlow Lite), and JAX.
  • Proficiency in Python and C/C++, with practical experience in ML model optimization and production deployment.
  • Deep experience with model quantization (PTQ/QAT), pruning, knowledge distillation, sparsity, and other compression techniques for efficient edge inference.
  • Hands-on experience developing for or integrating with AI chip SDKs, neural accelerators (NPUs/DSPs), or hardware-specific toolchains (e.g., NVIDIA TensorRT, Qualcomm Neural Processing SDK, ARM Ethos, or similar).
  • Familiarity with edge inference runtimes (ONNX Runtime, ExecuTorch, TVM) and optimizing models for hardware constraints (latency, memory footprint, power consumption).

Experience in one or more of the following areas is considered a strong plus:

  • Understanding of ML compiler and runtime design.
  • Experience working with tools such as Optimum, ONNX, TensorRT, TFLite/LiteRT, ncnn, or CoreML.
  • Familiarity with hardware acceleration techniques.
  • Experience in embedded system development.

Salary Range: $200,000 - $280,000 / year