Pinned · Published in The Constellar Digital & Technology Blog
Geek Out Time: Simple Local Testing of Llama 3 on Its Release, Gemma, and Mistral
Upon the release of Llama 3, I conducted tests on three models locally on my 8 GB RAM M1 MacBook: gemma:2b (I would have preferred to use…
Apr 21, 2024
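As a rough illustration of this kind of side-by-side local test, here is a minimal sketch that sends the same prompt to several locally served models, assuming they are exposed through Ollama's default REST endpoint; the model tags, prompt, and endpoint are assumptions for illustration, not details taken from the article.

```python
# Minimal sketch: send one prompt to several locally served models and compare
# the replies. Assumes the models are pulled and served via Ollama's default
# endpoint (http://localhost:11434); model tags and prompt are illustrative.
import requests

MODELS = ["gemma:2b", "llama3", "mistral"]
PROMPT = "Explain retrieval-augmented generation in one sentence."

for model in MODELS:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": PROMPT, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    print(f"--- {model} ---")
    print(resp.json()["response"].strip())
```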
Pinned · Published in The Constellar Digital & Technology Blog
Geek Out Time: Play with LangChain 3 — Simulate Full RAG Locally with word2vec and Gemma
With the Apple researchers’ unveiling of ReALM, following Gemma from Google, Llama from Meta, and a couple of others from Microsoft…
Apr 6, 2024
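The retrieval half of such a pipeline can be approximated with plain word2vec embeddings; the sketch below, with an assumed toy corpus and hyperparameters, trains gensim's Word2Vec, embeds documents by averaging word vectors, and retrieves the closest document for a query before handing it to a local model (generation step omitted).

```python
# Minimal sketch of word2vec-based retrieval for a local RAG pipeline.
# The corpus, vector size, and prompt template are illustrative assumptions.
import numpy as np
from gensim.models import Word2Vec

docs = [
    "gemma is a family of lightweight open models from google",
    "llama is a family of open models released by meta",
    "word2vec learns dense vector representations of words",
]
tokenized = [d.split() for d in docs]
w2v = Word2Vec(sentences=tokenized, vector_size=64, window=5, min_count=1, seed=42)

def embed(tokens):
    """Average the word vectors of the in-vocabulary tokens."""
    vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

query = "what is word2vec"
q_vec = embed(query.split())
best = max(range(len(docs)), key=lambda i: cosine(q_vec, embed(tokenized[i])))

# The retrieved document becomes the context of the prompt sent to a
# locally running Gemma model (generation step omitted here).
prompt = f"Context: {docs[best]}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```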
Pinned · Published in The Constellar Digital & Technology Blog
Geek Out Time: Build a Facade API for OpenAI API and Local LLM API
In 2024, OpenAI leads the Generative AI sector, favored for its pioneering, easy-to-use API and the advanced GPT-4. However, factors like…
Mar 29, 2024
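The facade idea can be sketched as a single endpoint that forwards to either OpenAI or a local LLM server; in the sketch below, the `/chat` route, request fields, environment variable name, and the assumption of an Ollama-style local server are all illustrative choices, not the article's implementation.

```python
# Minimal facade sketch: one /chat endpoint that routes to OpenAI or a local
# LLM server depending on a "provider" field. Paths, payload shapes, and the
# OPENAI_API_KEY variable are assumptions.
import os
import httpx
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    provider: str   # "openai" or "local"
    model: str      # e.g. "gpt-4" or "gemma:2b"
    prompt: str

@app.post("/chat")
async def chat(req: ChatRequest):
    messages = [{"role": "user", "content": req.prompt}]
    async with httpx.AsyncClient(timeout=120) as client:
        if req.provider == "openai":
            r = await client.post(
                "https://api.openai.com/v1/chat/completions",
                headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
                json={"model": req.model, "messages": messages},
            )
            r.raise_for_status()
            text = r.json()["choices"][0]["message"]["content"]
        else:  # assume an Ollama-style server on localhost
            r = await client.post(
                "http://localhost:11434/api/chat",
                json={"model": req.model, "messages": messages, "stream": False},
            )
            r.raise_for_status()
            text = r.json()["message"]["content"]
    return {"provider": req.provider, "model": req.model, "answer": text}
```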
Published in The Constellar Digital & Technology Blog
Geek Out Time: AI in the Browser - Run WebLLM for Powerful, Local LLM Experiences
WebLLM brings Large Language Models (LLMs) directly into your browser, leveraging WebGPU for on-device GPU computation. In this updated…
Dec 21, 2024
Published in The Constellar Digital & Technology Blog
Geek Out Time: Exploring Opensource AnythingLLM — The All-in-One, Easy AI Platform for Local RAG…
What is AnythingLLM?
Dec 8, 2024
Published in The Constellar Digital & Technology Blog
Exploring LoRA on Google Colab: the Challenges of Base Model Upgrades
How do we address the need to retrain models whenever the base model is upgraded? Retraining can be computationally…
Dec 5, 2024
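For context on why base-model upgrades are painful, here is a minimal sketch of attaching a LoRA adapter with the PEFT library; the base model name (`distilgpt2`) and the LoRA hyperparameters are illustrative assumptions. The saved adapter is tied to this base's architecture and weight shapes, which is what typically forces retraining after an upgrade.

```python
# Minimal LoRA sketch with PEFT: only the small adapter matrices are trained
# and saved, and they only load back onto a compatible base model.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

base = AutoModelForCausalLM.from_pretrained("distilgpt2")

lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],  # attention projection in GPT-2-style blocks
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable

# After fine-tuning, only the adapter is saved; loading it later requires a
# base model with the same architecture and weight shapes.
model.save_pretrained("lora-adapter")
```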
Published in The Constellar Digital & Technology Blog
Geek Out Time: Building an Interactive Career Coach with Google Colab and Synthesizing Agents Using…
In my previous…
Nov 9, 2024
Published in The Constellar Digital & Technology Blog
Geek Out Time: Simulating High-Bandwidth and Cache-Like Memory on Google Colab’s T4 GPU
This week, we’re delving into GPU memory hierarchies and exploring how different types of memory impact the performance of transformer…
Nov 6, 2024
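A rough way to feel the memory-hierarchy effect on a T4 is to time the same matrix multiply with operands resident in GPU memory versus streamed from host RAM each step; the sketch below does that with assumed matrix sizes and iteration counts, and is not the article's exact experiment.

```python
# Minimal sketch: compare a matmul whose data already lives in GPU memory with
# one that copies its input from (pinned) host RAM on every iteration.
import torch

assert torch.cuda.is_available()
device = torch.device("cuda")
x_gpu = torch.randn(4096, 4096, device=device)
w_gpu = torch.randn(4096, 4096, device=device)
x_cpu = x_gpu.cpu().pin_memory()  # pinned host memory for faster transfers

def timed(fn, iters=20):
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters  # milliseconds per iteration

resident = timed(lambda: x_gpu @ w_gpu)
streamed = timed(lambda: x_cpu.to(device, non_blocking=True) @ w_gpu)
print(f"GPU-resident: {resident:.2f} ms, host-to-GPU each step: {streamed:.2f} ms")
```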
Published in The Constellar Digital & Technology Blog
Geek Out Time: Exploration of Model Pruning for Efficient Deployment
This week, I’m exploring model pruning — a technique that removes unimportant weights from a neural network to make it more efficient…
Nov 2, 2024
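Magnitude pruning of this kind can be sketched with PyTorch's built-in pruning utilities; the toy model and the 30% pruning ratio below are illustrative assumptions, not figures from the article.

```python
# Minimal sketch of magnitude-based (L1) unstructured pruning in PyTorch.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Zero out the 30% of weights with the smallest absolute value in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # make the pruning permanent

linears = [m for m in model.modules() if isinstance(m, nn.Linear)]
zeros = sum((m.weight == 0).sum().item() for m in linears)
total = sum(m.weight.numel() for m in linears)
print(f"Overall weight sparsity: {zeros / total:.1%}")
```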
Published in The Constellar Digital & Technology Blog
Geek Out Time: Experimenting with FP32 vs. FP16 Quantization on Google Colab’s Free T4 GPU
Today, I’m diving into quantization — a technique to optimize deep learning models by reducing their precision. Specifically, I’m…
Oct 31, 2024
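The core comparison can be sketched by running the same matrix multiply in float32 and float16 on the GPU and looking at time and memory per operand; the matrix size and iteration count below are arbitrary choices for illustration, not the article's settings.

```python
# Minimal FP32-vs-FP16 sketch: benchmark one matmul in both precisions on CUDA.
import torch

assert torch.cuda.is_available()
device = torch.device("cuda")

def bench(dtype, n=4096, iters=50):
    a = torch.randn(n, n, device=device, dtype=dtype)
    b = torch.randn(n, n, device=device, dtype=dtype)
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    for _ in range(iters):
        a @ b
    end.record()
    torch.cuda.synchronize()
    mib_per_matrix = a.element_size() * a.nelement() / 2**20
    return start.elapsed_time(end) / iters, mib_per_matrix

for dtype in (torch.float32, torch.float16):
    ms, mib = bench(dtype)
    print(f"{dtype}: {ms:.2f} ms per matmul, {mib:.0f} MiB per matrix")
```

On a T4, the half-precision case benefits both from halved memory traffic and from tensor-core acceleration, which is why the gap is usually larger than the 2x that storage size alone would suggest.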