Models that modify Probability Distributions, I/O for Models via Web Protocols
LLMs are models that modify the probability distribution over a set of events and predict the next event.
In this case, the event is the next token, where a token can be a whole word or a subword.
Subword tokens let the model handle vocabulary that is not in its original vocabulary list, such as foreign words and proper nouns, by composing them from smaller pieces.
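
To see subword tokenization in action, here is a minimal sketch assuming the Hugging Face transformers library (the notes do not name a specific tokenizer):

from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# A rare or foreign word is unlikely to be a single entry in the
# vocabulary, so the tokenizer composes it from subword pieces.
print(tokenizer.tokenize("Schadenfreude"))

# Common words typically map to a single token each.
print(tokenizer.tokenize("The weather is nice"))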
For the GPT2LMHeadModel, the input can be a text phrase (tokenized first) or a numerical list of token ids.
When the input is a list of token ids, the output, let's call it x, has several attributes.

The logits are unnormalized scores over every token the model is trained on; applying a softmax turns them into a probability distribution. Is a larger vocabulary brute forcing? Even if it is brute forcing, is it helpful? A larger vocabulary means the probability distribution at each inference step is over more events, which increases compute.
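
To make x concrete, here is a minimal sketch, again assuming the Hugging Face transformers API, that runs one forward pass and inspects the logits:

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Text phrase in, list of token ids out.
inputs = tokenizer("The weather today is", return_tensors="pt")

with torch.no_grad():
    x = model(**inputs)  # the output object with several attributes

# x.logits has shape (batch, sequence_length, vocab_size):
# one unnormalized score per vocabulary token, per position.
print(x.logits.shape)

# Softmax over the last position gives the next-token distribution.
probs = torch.softmax(x.logits[0, -1], dim=-1)
next_id = torch.argmax(probs).item()
print(tokenizer.decode([next_id]))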

Two token counts to keep separate: the total number of tokens used to train the LLM (the size of the training corpus), and the maximum number of tokens the model accepts in one forward pass (the context length).
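
The second number is exposed in the model configuration; a minimal sketch, assuming the transformers GPT-2 config:

from transformers import GPT2Config

config = GPT2Config.from_pretrained("gpt2")
# Maximum number of tokens in one forward pass (context length);
# 1024 for the base GPT-2 checkpoint.
print(config.n_positions)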

ML Ops — Passing data to ML models via internet protocols: HTTP

import requests

# URL of a streaming or large audio file
audio_url = "https://sample-videos.com/audio/mp3/crowd-cheering.mp3"
response = requests.get(audio_url, stream=True)

# Save the streamed audio to a file
with open("streamed_audio.mp3", "wb") as f:
    for chunk in response.iter_content(chunk_size=1024):  # process chunks of 1 KB
        if chunk:
            f.write(chunk)  # save to file
print("Audio streaming complete. File saved as streamed_audio.mp3")
The snippet above shows how to work with streaming data using the requests library: stream=True tells requests not to download the body up front, and iter_content yields it in fixed-size chunks.
Load Balancing Algorithms in NGINX
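
NGINX offers several load-balancing methods for an upstream group: round robin (the default), least_conn, and ip_hash, among others. A minimal sketch of an nginx.conf fragment, with server addresses made up for illustration:

# Round robin is the default when no method is specified.
upstream app_servers {
    least_conn;             # send each request to the server with the fewest active connections
    # ip_hash;              # alternative: pin each client IP to the same server
    server 10.0.0.1:8000;
    server 10.0.0.2:8000;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_servers;
    }
}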

Passing data from the web server to the application server:
the client speaks HTTP, a text-based protocol, to the web server, which forwards the request to the application server over FastCGI (or an alternative), a binary protocol, as in the sketch below.
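
A minimal sketch of the nginx side of that handoff (the backend address is an assumption for illustration; a FastCGI application would be listening there):

location / {
    include fastcgi_params;          # forward the standard request variables
    fastcgi_pass 127.0.0.1:9000;     # speak binary FastCGI to the application server
}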

Another protocol: Memcached
To use Memcached, install the server software, then install the Python client, i.e. a Python interface. Memcached has only a few methods: get, set, delete, incr (increment a value), decr (decrement a value), and a couple of others.
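
A minimal sketch, assuming the pymemcache client (the notes do not name a specific Python client) and a Memcached server already running on the default port 11211:

from pymemcache.client.base import Client

client = Client(("localhost", 11211))

client.set("greeting", "hello", expire=60)  # store a value with a 60-second TTL
print(client.get("greeting"))               # b'hello'

client.set("visits", "0")   # counters must be stored as integer strings
client.incr("visits", 1)    # increment a value
client.decr("visits", 1)    # decrement a value

client.delete("greeting")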

gRPC is an HTTP/2-based protocol: services and their messages are defined in a .proto file, and the protoc compiler generates client and server code from it.
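
A minimal sketch of a gRPC client call. The Predictor service, its messages, and the port are hypothetical, invented here for illustration; predictor_pb2 and predictor_pb2_grpc would be generated by protoc from the corresponding .proto file:

# Hypothetical service definition (predictor.proto), compiled with protoc:
#
#   syntax = "proto3";
#   service Predictor {
#     rpc Predict (PredictRequest) returns (PredictReply) {}
#   }
#   message PredictRequest { string text = 1; }
#   message PredictReply  { string completion = 1; }

import grpc
import predictor_pb2       # generated message classes (hypothetical)
import predictor_pb2_grpc  # generated client stub (hypothetical)

# gRPC multiplexes calls over a single HTTP/2 connection.
channel = grpc.insecure_channel("localhost:50051")
stub = predictor_pb2_grpc.PredictorStub(channel)

reply = stub.Predict(predictor_pb2.PredictRequest(text="Hello"))
print(reply.completion)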

