<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Kevin Coder | tutorials, thought experiments & tech ramblings]]></title><description><![CDATA[I've been programming for 15+ years, and I'm still as curious as day one. Follow along as I dive into everything from AI to Django to Golang to Linux, and whatever else interests me in between.

This isn't your typical blog. I don't care much about SEO; I'm passion-driven, so I write about whatever interests me at the time, and those interests change often! Some posts are tutorials, some are quick references, and some are just my thoughts and ramblings on the current tech landscape.]]></description><link>https://kevincoder.co.za</link><generator>RSS for Node</generator><lastBuildDate>Sat, 11 Apr 2026 05:35:07 GMT</lastBuildDate><atom:link href="https://kevincoder.co.za/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Golang: Building a windows parental control app using wails]]></title><description><![CDATA[C# is one of my favorite languages; if you're building Windows apps, you're probably better off with C#, but I'm kind of rusty, and WinForms is old school now, so I've no choice but to use WPF 😔.
The]]></description><link>https://kevincoder.co.za/golang-building-a-windows-parental-control-app-using-wails</link><guid isPermaLink="true">https://kevincoder.co.za/golang-building-a-windows-parental-control-app-using-wails</guid><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Mon, 09 Mar 2026 15:53:25 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/6508a821cf1020be165986a9/57db2af5-d58d-4bb2-8d76-3d2cf9d7735d.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>C# is one of my favorite languages; if you're building Windows apps, you're probably better off with C#, but I'm kind of rusty, and WinForms is old school now, so I've no choice but to use WPF 😔.</p>
<p>The problem is I don't like XAML, the XML-based markup language used by WPF (Windows Presentation Foundation). WPF is basically a UI framework for building Windows desktop applications, and apparently, the Dotnet crowd prefers this over WinForms these days 🤷.</p>
<p>Luckily, I discovered <a href="https://wails.io/">Wails</a>, an interesting project that essentially uses a WebView so you can build desktop UIs in HTML, CSS, and JavaScript. Additionally, you can connect those UI elements to a Go backend and even access native Windows APIs, similar to how you would wire up a code-behind button click handler in C# WinForms.</p>
<p>It's also no secret that I'm a big fan of Golang. It's fast, it's clean, and I enjoy writing in Go all the time. I've built everything from scrapers to a full-on vector database in Go, so why not take a whack at building a desktop app too 🤷.</p>
<p>Anyway, I started experimenting with building a simple <strong>parental control app</strong>. The web can be a dangerous place, and I have an 8-year-old who loves binge-watching cartoons on my laptop after sucking the life out of every battery-operated mobile device in the house 😆.</p>
<p>So, as you can imagine, I’d like to block all the bad junk on the internet or at least filter out the obvious bad sites and inappropriate content. This article is merely an experimental log of my journey and is not meant to be used in an actual production app. You probably would be better off subscribing to one of the existing parental control apps on the market.</p>
<h2>Which approach should we use?</h2>
<p><strong>TLDR:</strong> We'll use Windows APIs to get a list of active windows, scan their titles, and finally close any running programs whose titles suggest inappropriate content.</p>
<p>There are many approaches to building a solid parental control app; as mentioned, we're just experimenting for now and not looking to build anything overly complicated. Some options we have:</p>
<ul>
<li><p><strong>A VPN service.</strong> With a VPN-based approach, all network traffic is routed through your app. This allows your app to inspect the connections being made and potentially analyze the content being accessed. In theory, this means you could decrypt HTTPS traffic and inspect full web pages, including the document body, images, videos, and other resources. As you can imagine, this is a pretty solid approach and can help you build a fairly robust system.</p>
</li>
<li><p><strong>A DNS service</strong>. While the VPN approach is the most comprehensive solution, decrypting HTTPS traffic comes with serious security risks. If your VPN implementation is compromised, it could effectively create a <strong>man-in-the-middle (MITM) attack</strong>, allowing someone to intercept sensitive user data. Instead, we could just build a lightweight DNS service that allows us to read the relevant hostnames (mywebsite.com). This way we can safely monitor incoming traffic and block bad websites without actually decrypting anything.</p>
</li>
<li><p><strong>The etc hosts file.</strong> Another simple approach that works surprisingly well. Instead of building and managing a DNS service (which is slightly more complex), we can simply modify a small system file called <code>hosts</code>, located at:</p>
<p><code>C:\Windows\System32\drivers\etc\hosts</code></p>
<p>As a web developer, I use this file all the time to map local domains during development. The same mechanism can also be used to block websites. By mapping bad domains to <code>127.0.0.1</code>, any request to those domains will just be redirected back to the local machine, effectively preventing the site from loading. I actually got this working nicely in the C# version.</p>
</li>
<li><p><strong>Window titles.</strong> Even simpler, and this is the approach we'll use in our Golang program. With this approach, you basically get a list of all running windows and inspect their titles. Many websites include SEO keywords in the page title, which often contains the bad words we're looking for. By scanning these titles for certain keywords, we can easily detect when inappropriate content is being viewed and respond accordingly.</p>
</li>
</ul>
<h2>Setting up Wails</h2>
<p>Installing Wails is a breeze, just like any other Go tool:</p>
<pre><code class="language-shell">go install github.com/wailsapp/wails/v2/cmd/wails@latest
wails init -n parental-control-app -t vanilla-ts
</code></pre>
<p>The second command sets up a skeleton project for you, which will look something like this:</p>
<pre><code class="language-shell">├── README.md
├── app.go
├── go.mod
├── go.sum
├── main.go
├── lib
│   └── program_scanner.go
├── myproject.exe
├── parental-control-app
│   ├── README.md
│   ├── app.go
│   ├── go.mod
│   ├── go.sum
│   ├── main.go
│   └── wails.json
└── wails.json
</code></pre>
<p>I excluded the <code>frontend</code> folder from the tree above; this is where your UI lives:</p>
<pre><code class="language-shell">├── index.html
├── package-lock.json
├── package.json
├── package.json.md5
├── src
│   ├── app.css
│   ├── assets
│   │   ├── fonts
│   │   │   ├── OFL.txt
│   │   │   └── nunito-v16-latin-regular.woff2
│   │   └── images
│   │       └── logo-universal.png
│   ├── main.ts
│   ├── style.css
│   └── vite-env.d.ts
├── tsconfig.json
└── wailsjs
    ├── go
    │   └── main
    │       ├── App.d.ts
    │       └── App.js
    └── runtime
        ├── package.json
        ├── runtime.d.ts
        └── runtime.js
</code></pre>
<p>I chose the vanilla template to keep things simple, but Wails does support React, Vue, and a bunch of other JS frameworks too, so pick your poison accordingly ✊</p>
<p>When you open up the <code>main.go</code> file, you'll see something like this:</p>
<pre><code class="language-go">package main

import (
	"embed"

	"github.com/wailsapp/wails/v2"
	"github.com/wailsapp/wails/v2/pkg/options"
	"github.com/wailsapp/wails/v2/pkg/options/assetserver"
)

//go:embed all:frontend/dist
var assets embed.FS

func main() {
	// Create an instance of the app structure
	app := NewApp()

	err := wails.Run(&amp;options.App{
		Title:  "My App",
		Width:  1024,
		Height: 768,
		AssetServer: &amp;assetserver.Options{
			Assets: assets,
		},
		BackgroundColour: &amp;options.RGBA{R: 27, G: 38, B: 54, A: 1},
		OnStartup:        app.startup,
		Bind: []interface{}{
			app,
		},
	})

	if err != nil {
		println("Error:", err.Error())
	}
}
</code></pre>
<p>Fairly straightforward: this is basically setting up a standard window with a title, a width and height, a background colour, and so forth.</p>
<p>The most interesting thing in the boilerplate code is this part:</p>
<pre><code class="language-go">		Bind: []interface{}{
			app,
		},
</code></pre>
<p>This is where the JavaScript magic happens. We bind our <code>app</code> struct, so its methods become callable from the frontend JS code.</p>
<p>Let's peek inside <code>app.go</code>:</p>
<pre><code class="language-go">package main

import (
	"context"
	"fmt"
	"myproject/models"
)

// App struct
type App struct {
	ctx context.Context
}

// NewApp creates a new App application struct
func NewApp() *App {
	return &amp;App{}
}

// startup is called when the app starts. The context is saved
// so we can call the runtime methods
func (a *App) startup(ctx context.Context) {
	a.ctx = ctx
}

// Greet returns a greeting for the given name
func (a *App) Greet(name string) string {
	return fmt.Sprintf("Hello %s, It's show time!", name)
}
</code></pre>
<p>Nothing fancy, but pay close attention to the "Greet" method. This is a function that is now exposed to the frontend, and you can call it from anywhere in your JS/TypeScript code just like any other function.</p>
<p>Next, look inside <code>frontend/src/main.ts</code>:</p>
<pre><code class="language-typescript">// Greet is the auto-generated binding for our Go method (it lives under
// frontend/wailsjs); nameElement and resultElement are DOM lookups defined
// earlier in the template file.
import { Greet } from "../wailsjs/go/main/App";

window.greet = function () {
    // Get name
    let name = nameElement!.value;

    // Check if the input is empty
    if (name === "") return;

    // Call App.Greet(name)
    try {
        Greet(name)
            .then((result) =&gt; {
                // Update result with data back from App.Greet()
                resultElement!.innerText = result;
            })
            .catch((err) =&gt; {
                console.error(err);
            });
    } catch (err) {
        console.error(err);
    }
};
</code></pre>
<p>Notice the <code>Greet</code> call inside the try/catch! We are effectively calling the <code>Greet</code> method on our App struct 🎉. How powerful is that?</p>
<h2>Okay, but where’s the native Windows stuff?</h2>
<p>We managed to create a UI in TypeScript and then fire a function in Golang, but you could just launch a headless browser and use WASM to do the same thing, right? So, what's the point?</p>
<p>Ye of little faith 😏!</p>
<p>Since we can run Go code and Go has bindings for Win32 APIs, we can effectively access almost any native functionality we need.</p>
<p>If you don't know what Win32 is, basically it's a standard set of APIs that the Windows OS heavily relies on to manage low-level systems like the UI, networking, security, and so forth. Whether you've used XP, Vista, 7, 10, or 11, you've probably noticed that decade-old programs still run perfectly. This is largely thanks to Win32, enabling Microsoft's exceptional backward compatibility.</p>
<p>Essentially, it's a collection of DLLs, or Dynamic Link Libraries. If you think of this in terms of Golang, a DLL is similar to a package containing code that you can import and use. The key difference is that DLLs are precompiled and shared across programs at runtime, rather than being compiled into your binary like Go packages, but the concept is similar.</p>
<p>Golang provides a way to interact with Win32 via this package:</p>
<pre><code class="language-go">golang.org/x/sys/windows
</code></pre>
<p>And enabling access to Win32 APIs is as easy as:</p>
<pre><code class="language-go">user32 = windows.NewLazySystemDLL("user32.dll")
</code></pre>
<p>In this case, we import <code>user32.dll</code>, which gives us access to the general Windows UI and windowing system. This is exactly what we'll need to list running programs and gather the essential information to decide whether a window is showing bad content or not.</p>
<blockquote>
<p><code>NewLazySystemDLL</code> allows Go to load the DLL lazily, meaning the DLL and its functions are only loaded when they are actually needed instead of when the program starts.</p>
</blockquote>
<p>Moving along, let's set up references in Go to the functions we need to access:</p>
<pre><code class="language-go">var (
	user32                       = windows.NewLazySystemDLL("user32.dll")
	procEnumWindows              = user32.NewProc("EnumWindows")
	procIsWindowVisible          = user32.NewProc("IsWindowVisible")
	procGetWindowTextLengthW     = user32.NewProc("GetWindowTextLengthW")
	procGetWindowTextW           = user32.NewProc("GetWindowTextW")
	procGetWindowThreadProcessId = user32.NewProc("GetWindowThreadProcessId")
	procPostMessageW             = user32.NewProc("PostMessageW")
)
</code></pre>
<p>In addition to importing the <code>user32.dll</code> DLL, we’re also creating pointers to several Win32 functions. These act as a bridge between Go and the native Windows API. Once defined, we can call these functions and pass arguments to them almost as if they were normal Go functions.</p>
<ul>
<li><p><code>EnumWindows</code> - Will give us a list of running programs.</p>
</li>
<li><p><code>IsWindowVisible</code> - Checks whether Windows considers a window visible, even if it’s minimized or behind other windows.</p>
</li>
<li><p><code>GetWindowTextLengthW</code> - Basically <code>string.length</code> for a window's title bar text; this allows us to allocate a buffer of the correct size before reading the text.</p>
</li>
<li><p><code>GetWindowTextW</code> - As you probably guessed by now, this is the actual title bar text.</p>
</li>
<li><p><code>GetWindowThreadProcessId</code> - We'll use this to get the program's process ID so that we can terminate the program if we detect bad content.</p>
</li>
<li><p><code>PostMessageW</code> - Lets us post window messages, such as <code>WM_CLOSE</code>, so we can ask an offending program's window to close.</p>
</li>
</ul>
<h2>A working prototype</h2>
<p>Finally, here's the full Go code for V1 of our parental control app:</p>
<pre><code class="language-go">//go:build windows

package models

import (
	"strings"
	"syscall"
	"unsafe"

	"golang.org/x/sys/windows"
)

var (
	user32                   = windows.NewLazySystemDLL("user32.dll")
	procEnumWindows          = user32.NewProc("EnumWindows")
	procIsWindowVisible      = user32.NewProc("IsWindowVisible")
	procGetWindowTextLengthW = user32.NewProc("GetWindowTextLengthW")
	procGetWindowTextW       = user32.NewProc("GetWindowTextW")
	procPostMessageW         = user32.NewProc("PostMessageW")
)

func ScanWindowTitles(hwnd uintptr, lparam uintptr) uintptr {
	vis, _, _ := procIsWindowVisible.Call(hwnd)

	if vis == 0 {
		return 1
	}

	tlen, _, _ := procGetWindowTextLengthW.Call(hwnd)

	if tlen == 0 {
		return 1
	}

	buf := make([]uint16, tlen+1)
	procGetWindowTextW.Call(hwnd, uintptr(unsafe.Pointer(&amp;buf[0])), tlen+1)
	title := windows.UTF16ToString(buf)
	if title == "" {
		return 1
	}

	if strings.Contains(title, "lotto") {
		procPostMessageW.Call(hwnd, 0x0010, 0, 0)
	}

	return 1

}

func RunBadAppChecker() error {

	windowScannerCallback := syscall.NewCallback(ScanWindowTitles)
	result, _, err := procEnumWindows.Call(windowScannerCallback, 0)
	if result == 0 {
		if err != nil &amp;&amp; err != syscall.Errno(0) {
			return err
		}
		return syscall.EINVAL
	}

	return nil
}
</code></pre>
<p>The naming convention is a bit weird; it comes from the C-style approach of Win32. Here's a breakdown block-by-block of what's going on.</p>
<p>The logic behind this code:</p>
<ul>
<li><p><code>windowScannerCallback := syscall.NewCallback(ScanWindowTitles)</code> ~ converts a Go function into a Windows-compatible callback function, essentially a pointer, so it can be passed to Win32 APIs correctly. Additionally, this prevents Golang's garbage collector from getting "confused". Since we're sort of tunnelling through to native libraries, the garbage collector isn't fully aware of the callback's state, so it might otherwise try to collect it or move it around in memory when it shouldn't.</p>
</li>
<li><p><code>result, _, err := procEnumWindows.Call(windowScannerCallback, 0)</code> ~ We basically loop through all open windows and run this callback on each one. Similar to an array map function. The second argument <code>0</code> (of type <code>uintptr</code> ) is just an additional state argument we can pass to the function; in this case, we are not using it, so we set it to 0.</p>
</li>
</ul>
<blockquote>
<p>💡 The <code>uintptr</code> type can hold a large integer representing a memory address. In Go terms, when you pass <code>&amp;someVariable</code> to get a pointer, you're getting a memory address - <code>uintptr</code> is just that address stored as a plain integer.</p>
</blockquote>
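<p>The <code>EnumWindows</code> contract (the callback returns nonzero to keep enumerating, zero to stop) can be mimicked in plain Go. Here's a minimal sketch, with a hypothetical <code>enumerate</code> helper standing in for the real Win32 call:</p>
<pre><code class="language-go">package main

import "fmt"

// enumerate mimics the EnumWindows contract: it invokes the callback for
// each item and stops as soon as the callback returns 0.
func enumerate(titles []string, callback func(title string) uintptr) {
	for _, t := range titles {
		if callback(t) == 0 {
			return
		}
	}
}

func main() {
	titles := []string{"Notepad", "Lotto results", "Calculator"}
	enumerate(titles, func(title string) uintptr {
		fmt.Println("visiting:", title)
		if title == "Lotto results" {
			return 0 // stop enumeration early
		}
		return 1 // continue to the next window
	})
}
</code></pre>
<p>Our real callback always returns 1, because we want to visit every window even after closing a bad one.</p>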
<p>Looking inside <code>ScanWindowTitles</code>, we first check if the current window is visible:</p>
<pre><code class="language-go">	vis, _, _ := procIsWindowVisible.Call(hwnd)

	if vis == 0 {
		return 1
	}
</code></pre>
<p>Next, we check if the window has a title:</p>
<pre><code class="language-go">	tlen, _, _ := procGetWindowTextLengthW.Call(hwnd)

	if tlen == 0 {
		return 1
	}
</code></pre>
<blockquote>
<p>💡 <code>hwnd</code> ~ short for "window handle", is just Windows' convention for a window ID. Think of it like HTML: you give a <code>&lt;div&gt;</code> an ID to reference it in CSS/JS. Similarly, <code>hwnd</code> is an ID that lets you reference and interact with a specific window through Windows API calls.</p>
</blockquote>
<p>At this point, we now have a window that's <code>a)</code> visible and <code>b)</code> has a title. So, this means we can scan it for bad words.</p>
<p><em>PS: Much of what you've seen thus far in this code is ceremonial C/C++, since Windows is built on these languages for historical and performance reasons. As Go developers, we're very privileged to have a high-performance language that's also clean and abstract enough without compromising too much on performance.</em></p>
<p><em>Sure, C/C++ will outperform Go in nearly all cases, but for modern computing, especially for web developers, that kind of raw performance is often overkill. Nonetheless, since we are interfacing with Win32, we've just got to swallow that hard pill of C/C++ style programming!</em></p>
<p>Anyway, getting back to our window title scanner, we have this weird-looking block of code:</p>
<pre><code class="language-go">	buf := make([]uint16, tlen+1)
	procGetWindowTextW.Call(hwnd, uintptr(unsafe.Pointer(&amp;buf[0])), tlen+1)
	title := windows.UTF16ToString(buf)
	if title == "" {
		return 1
	}
</code></pre>
<p>We create a buffer of type <code>uint16</code> - essentially a fixed-size array that will hold the characters making up the window title. Windows uses UTF-16 encoding, which encodes each character as one or two 16-bit values.</p>
<p>The length of this buffer will be <code>n+1</code>, where <code>n</code> is the number of characters in the title, plus one <strong>null terminator</strong> (an invisible character that has no relevance to the actual string besides being a marker). The null terminator is a bit weird for us Go developers, but this is how C-style strings work; they need a way to know where the string ends.</p>
<pre><code class="language-go">procGetWindowTextW.Call(hwnd, uintptr(unsafe.Pointer(&amp;buf[0])), tlen+1)
title := windows.UTF16ToString(buf)
if title == "" {
  return 1
}
</code></pre>
<p>We're essentially extracting the window title and writing it into the buffer we created earlier - similar to a file I/O write operation. We return 1 if nothing came through; returning status codes like this is standard Win32 style. In the <code>EnumWindows</code> callback, returning 1 simply means "skip ahead to the next window".</p>
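<p>If that buffer dance feels alien, here's the same idea in pure Go, using the standard <code>unicode/utf16</code> package to simulate what Windows hands back (the window title here is just a made-up example):</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"strings"
	"unicode/utf16"
)

// utf16ZToString mimics windows.UTF16ToString: it decodes a UTF-16 buffer,
// stopping at the first null terminator.
func utf16ZToString(buf []uint16) string {
	for i, v := range buf {
		if v == 0 {
			buf = buf[:i]
			break
		}
	}
	return string(utf16.Decode(buf))
}

func main() {
	// Encode a title to UTF-16 code units and append the null terminator,
	// just like the buffer that GetWindowTextW fills in.
	title := "Lotto results - Chrome"
	buf := utf16.Encode([]rune(title))
	buf = append(buf, 0)

	decoded := utf16ZToString(buf)
	fmt.Println(decoded)
	fmt.Println(strings.Contains(strings.ToLower(decoded), "lotto"))
}
</code></pre>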
<p>Now, we finally get to the bad-word-checking part. I made this super simple, but in reality, you'd probably want to check whether the value is in a list or do a DB lookup of some sort.</p>
<pre><code class="language-go">
if strings.Contains(title, "lotto") {
   procPostMessageW.Call(hwnd, 0x0010, 0, 0)
}
</code></pre>
<p><code>0x0010</code> ~ is a hexadecimal message code (Constant in Win32 API: <code>WM_CLOSE</code>) that tells Windows we want to close that window.</p>
<p><code>PostMessageW</code> is sort of like a postman; we use it to send messages to the underlying Win32 API. This call does not wait for a response, so it's essentially asynchronous, like dropping a message into a queue. And since older systems had memory constraints, old-school C/C++ APIs often used constant hexadecimal or integer values like this to represent actions.</p>
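<p>Before moving on: the hard-coded <code>"lotto"</code> check could be generalized into a small helper. A rough sketch, assuming a hypothetical <code>blockedTerms</code> list (in a real app this might come from a config file or database):</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"strings"
)

// blockedTerms is a made-up banned-keyword list, purely for illustration.
var blockedTerms = []string{"lotto", "casino", "poker"}

// containsBannedKeyword reports whether a window title matches any banned
// term, using a case-insensitive comparison.
func containsBannedKeyword(title string) (string, bool) {
	lower := strings.ToLower(title)
	for _, term := range blockedTerms {
		if strings.Contains(lower, term) {
			return term, true
		}
	}
	return "", false
}

func main() {
	term, bad := containsBannedKeyword("National LOTTO Results - Chrome")
	fmt.Println(term, bad)
}
</code></pre>
<p>Inside <code>ScanWindowTitles</code>, you'd then post <code>WM_CLOSE</code> whenever this helper returns true.</p>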
<h2>The C# approach</h2>
<p>While this article is not really about C#, I did also experiment with the Dotnet ecosystem to see how far I could take this. It was very easy, actually 🤔 Although, to be honest, I got a lot of help from Claude Code.</p>
<blockquote>
<p>❗ Beginner developers, be careful! I don't suggest using AI to learn initially. I strongly suggest reading books first and spending 3-6 months coding on your own. I use AI to learn sometimes and to generate code, but I have a lot of programming experience. I may not fully understand C#'s syntax and APIs; however, I understand the logic behind what is happening and can easily steer the AI in the right direction when it goes off-track.</p>
</blockquote>
<p>I didn't cover the hosts-file aspect in the Go code yet, but here's the C# version:</p>
<pre><code class="language-csharp">var hosts = File.ReadAllLines(Config.HostsPath).ToList();

hosts.RemoveAll(line =&gt; line.Contains(Config.HostsMarker));
hosts.Add($"\n{Config.HostsMarker} - START");

foreach (var url in Config.BlockedUrls) {
 hosts.Add($"127.0.0.1 {url} {Config.HostsMarker}");
}

// Close the managed block and write the file once, after all entries are added
hosts.Add($"{Config.HostsMarker} - END");
File.WriteAllLines(Config.HostsPath, hosts);
</code></pre>
<p>In our C# version, we open the <code>hosts</code> file and write a list of bad hostnames to it; each entry points to <code>127.0.0.1</code> (localhost). This ensures that whenever you try to visit any of these domains in your browser, the request gets redirected to your own machine instead of the actual website.</p>
<p>Since nothing is typically running on port 80 locally, the browser will either show an error, display nothing, or render whatever service you happen to have running on that port.</p>
<p>Before writing, we also search for all lines that contain our custom marker and delete them. The marker is simply a string like <code># BAD DOMAINS</code> that we append to the end of each managed line so that we can keep track of the changes we make to the file. This is important because the file can be edited by the OS or users at any time, and we want to ensure that we don't interfere with any lines not managed by the program.</p>
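<p>For completeness, the same marker-based rewrite can be sketched in Go, too. This is a simplified variant that tags every managed line with the marker instead of START/END fences; the marker string and domain list are illustrative assumptions, and in a real app you'd read and write the actual hosts file path shown earlier:</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"strings"
)

// hostsMarker tags every line we manage so that re-runs can safely remove
// only our own entries. Marker and domains are illustrative assumptions.
const hostsMarker = "# PARENTAL-CONTROL"

var blockedURLs = []string{"badsite.example", "casino.example"}

// rewriteHosts drops previously managed lines, then appends a fresh
// block mapping each blocked domain to 127.0.0.1.
func rewriteHosts(contents string) string {
	var kept []string
	for _, line := range strings.Split(contents, "\n") {
		if !strings.Contains(line, hostsMarker) {
			kept = append(kept, line)
		}
	}
	for _, url := range blockedURLs {
		kept = append(kept, fmt.Sprintf("127.0.0.1 %s %s", url, hostsMarker))
	}
	return strings.Join(kept, "\n")
}

func main() {
	original := "127.0.0.1 localhost\n127.0.0.1 old.example # PARENTAL-CONTROL"
	fmt.Println(rewriteHosts(original))
}
</code></pre>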
<p>For the title scanner, we can implement it in this way:</p>
<pre><code class="language-csharp">
foreach (var proc in Process.GetProcesses())
{
    try
    {
        var title = proc.MainWindowTitle;

        if (CheckIfBannedKeyword(title, out string? matchedKeyword))
        {
            if (matchedKeyword != null)
                BlockContent(proc.MainWindowHandle, proc.ProcessName, title, "Keyword", matchedKeyword);
        }

    }
    catch { }
}
</code></pre>
<pre><code class="language-csharp">static bool CheckIfBannedKeyword(string text, out string? matchedKeyword)
{
    matchedKeyword = null;
    if (string.IsNullOrEmpty(text)) return false;
    var lower = text.ToLower();

    foreach (var term in Config.BlockedTerms)
    {
        if (lower.Contains(term))
        {
            matchedKeyword = term;
            return true;
        }
    }

    return false;
}
</code></pre>
<pre><code class="language-csharp">static void BlockContent(IntPtr windowHandle, string browserName, string windowTitle, string triggerType, string triggerValue)
{
    if (windowHandle != IntPtr.Zero)
    {
        db.LogViolation(triggerType, triggerValue, browserName, windowTitle);

        SetForegroundWindow(windowHandle);
        Thread.Sleep(100);

        keybd_event(VK_CONTROL, 0, 0, UIntPtr.Zero);
        keybd_event(VK_W, 0, 0, UIntPtr.Zero);
        keybd_event(VK_W, 0, KEYEVENTF_KEYUP, UIntPtr.Zero);
        keybd_event(VK_CONTROL, 0, KEYEVENTF_KEYUP, UIntPtr.Zero);
    }
}
</code></pre>
<p>We:</p>
<ul>
<li><p>First, get a list of running processes.</p>
</li>
<li><p>Next, get the window title and check whether it contains any of the words in our banned keywords list.</p>
</li>
<li><p>Finally, close the offending window by bringing it to the foreground and sending it a Ctrl+W keystroke.</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Why AGI is a Pipe Dream and what we should build instead]]></title><description><![CDATA[I have been thinking about where we are at with AI in 2026; I think most models have now reached their intelligence ceiling. They’ll get incrementally better over time, sure, but there’s a big difference between 10% to 80% and 80% to 100%.
In most te...]]></description><link>https://kevincoder.co.za/why-agi-is-a-pipe-dream-and-what-we-should-build-instead</link><guid isPermaLink="true">https://kevincoder.co.za/why-agi-is-a-pipe-dream-and-what-we-should-build-instead</guid><category><![CDATA[tiny llm]]></category><category><![CDATA[agi]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Thu, 12 Feb 2026 08:34:12 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1770885008482/fdf2163c-45d0-4cff-9cad-7ba4bb7b6bfc.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I have been thinking about where we are at with AI in 2026; I think most models have now reached their intelligence ceiling. They’ll get incrementally better over time, sure, but there’s a big difference between 10% to 80% and 80% to 100%.</p>
<p>In most tech projects, the first 80% is relatively straightforward: CRUD operations, database schema design, and boilerplate code. The real challenge lies in the remaining 20%, the core business logic, and the actual problem being solved. That part demands deep domain knowledge, experience, and sustained effort ~ things LLMs cannot simply replicate.</p>
<p>LLMs have gotten bigger, more resource-intensive, and are eating up GPUs like there’s no tomorrow! They’ve swallowed up the entire internet, but they're still nowhere close to replacing human beings. AGI is not coming; it’s a get-rich-quick pipe dream concocted by all these billionaire tech CEOs.</p>
<p>In reality, throwing more compute and data at models is not making them that much smarter. In fact, we are probably going to run out of good-quality public data because of the fall of Stack Overflow and smaller publishers that are going out of business.</p>
<p>In this article, let’s talk about a potential solution to many of these AI problems.</p>
<h2 id="heading-agi-is-not-coming">AGI is not coming</h2>
<p>Even though I’m not a data science expert, I have worked with LLMs for the better part of 3-4 years now and have done tons of research into how they work and the field in general. This gives me some basis to comment on AGI and the future of AI, I think?</p>
<p>AGI is a pipe dream because LLMs are just prediction machines at the end of the day. Don’t get me wrong, the current iteration of this tech is really amazing and works really well in certain use cases, but it’s far from perfect and nowhere near what a human can do.</p>
<p>They have no worldview or real understanding of the data. They are trained on billions of examples, questions, answers, and fragments of information. They operate by breaking sentences into tokens, mathematically comparing inputs to learned patterns, and returning a human-like response through statistical grounding.</p>
<p>It’s a fancy algorithm; this is why AGI with the current architecture is impossible.</p>
<p>Secretly, AI companies have accepted this fact. They say many things in the media, but in reality, models are becoming commodities, and AI companies are just building ecosystems around these, so that they can lock in those subscriptions and keep you in their environment.</p>
<p>Think about it, why build Claude Code, a tool for developers, if AGI is going to replace all these developers, who then is going to pay the subscription fees?</p>
<p>AGI is not coming, and LLMs are becoming commodities; so, what’s next?</p>
<h2 id="heading-agi-is-fueling-public-anxiety">AGI is fueling public anxiety</h2>
<p>People are scared of AI, let alone AGI, because every time you tune into the news, YouTube, TikTok, or some other media platform, you see big tech CEOs making outlandish claims about how AI is so close to becoming AGI, how they want everyone to use this tech, and how, if you don’t, you’re missing out and will likely be replaced by AI.</p>
<p>These claims alienate potential users rather than attracting them. Additionally, companies like Microsoft have been criticized for poor implementations of their AI offerings, basically just forcing AI into everything, even in contexts where it adds no real value and makes the user experience suck so much more!</p>
<p>This pushes the average Joe towards two extremes: either using the technology inappropriately, such as relying on it for critical medical advice, or dismissing it altogether as yet another unreliable app that only works some of the time.</p>
<p>The truth is that AI is not as important as big tech CEOs would like you to think. Now, don’t get me wrong, it’s a powerful tool and can be really useful when used correctly, but it’s not as impactful as something like mobile phones or even the web.</p>
<p>Mobile phones are part of everyday life; we use them to text, communicate, socialize, consume media, and also as a business tool. The use cases are endless and widespread, and while I was born in a world where mobile phones didn’t originally exist, I still value this device enough to confidently say I just cannot function day-to-day without it!</p>
<p>I cannot say the same about AI. Sure, it makes my life a little easier, but its use cases are very niche, and for most people, they’re just using it as a Google++ or social media content generator, hardly life-changing!</p>
<p>I would consider it as a value-add rather than a full-blown platform on its own.</p>
<h2 id="heading-tiny-llms-are-the-future">Tiny LLMs are the future</h2>
<p>One of the biggest problems with LLMs is that they are GPU hungry. Your typical LLM, like Claude Opus, has been trained on trillions of tokens containing data from the entire public web and then some.</p>
<p>Yet I’m just using Claude code to generate some PHP or Python code; do I really need a model trained on trillions of tokens for that? The energy and infrastructure costs are significantly high, and it’s hard not to question whether this arms race toward AGI is worth it.</p>
<p>Instead, we should pursue a different approach: treat frontier models primarily as statistical engines, paired with a strong baseline of intelligence and tool-calling capabilities.</p>
<p>So basically, a kind of “Flash” or “Mini” model similar to GPT 5 MINI or Gemini 2.5 Flash. The goal would be a model capable of basic probabilistic generation and lightweight reasoning, without the cost and complexity of a full frontier-scale model like Opus.</p>
<p>This will reduce the amount of GPU resources we need, making these models more scalable and affordable for mass adoption.</p>
<h2 id="heading-knowledge-packs-can-solve-most-issues">Knowledge packs can solve most issues</h2>
<p>This brings me to the central question: could tiny LLMs replace larger models like Sonnet or Opus for most use cases? The challenge is that smaller models lack the extensive knowledge corpus of their larger counterparts, constraining their performance across many tasks. They're better suited for narrow, specialized use cases.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770885052753/4eb9c593-8af9-4acd-99fc-a1fc9f4daa2d.png" alt class="image--center mx-auto" /></p>
<p>This is where knowledge packs come in. A knowledge pack is like a black box of pretrained weights; think of it like a plugin. You don’t need to know what makes up the plugin or how it even works; all you need is a consistent API that the LLM can talk to in order to retrieve the knowledge it needs as the need arises.</p>
<p>For example, a knowledge pack for the PHP programming language might be 20-30GB in size, which can comfortably fit on a modern laptop or computer. When you’re working in Laravel, you would then point the LLM to that knowledge pack, and it should naturally then scope its knowledge to that pack.</p>
<p>Whenever it needs to generate code or answer a question, it then only needs to look up information in that pack and not get confused by information that may look similar but is not related.</p>
<p>To give you a simple example: Let's take a random name: "Bob Barker." If I ask a large LLM, "Who is Bob Barker?", the model will generate a response based on patterns it learned during training from sources like LinkedIn, Wikipedia, Google searches, and countless other documents.</p>
<p>A frontier model might have 100B+ parameters because it was trained to handle everything from PHP programming to celebrity trivia to medical questions. Each time you prompt this model, it processes the input by running computations across all these parameters, requiring significant compute.</p>
<p>Now, imagine instead you have a much smaller base model, say 2B parameters, that was trained primarily on Wikipedia. This model would use a fraction of the computational resources at inference time simply because there are fewer parameters to process. It would also likely give more accurate answers because its training was concentrated on a specific domain, using a much narrower and more focused dataset (almost like fine-tuning).</p>
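<p>To put rough numbers on this, a common rule of thumb (an approximation, not an exact figure) is that a dense transformer spends about 2 FLOPs per parameter per generated token at inference, so the parameter count translates almost directly into compute:</p>

```python
# Back-of-the-envelope inference cost, using the rough "2 FLOPs per
# parameter per token" rule of thumb for dense transformer models.

def inference_flops(params: float, tokens: int) -> float:
    """Approximate forward-pass FLOPs to generate `tokens` tokens."""
    return 2 * params * tokens

frontier = inference_flops(100e9, 1000)  # 100B-parameter model, 1k tokens
tiny = inference_flops(2e9, 1000)        # 2B-parameter model, same workload

print(f"frontier: {frontier:.1e} FLOPs")
print(f"tiny:     {tiny:.1e} FLOPs")
print(f"ratio:    {frontier / tiny:.0f}x")  # 50x less compute for the tiny model
```

<p>Real workloads complicate this (KV caching, mixture-of-experts, quantization), but the linear relationship between active parameters and compute is the core of the argument.</p>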
<p>This is where knowledge packs come in. Instead of one massive model trying to encode all human knowledge in its weights, you'd have a lean base model paired with specialized knowledge modules that can be loaded as needed. When you're asking about Bob Barker, you'd activate the "general knowledge" pack. When you're coding in Laravel, you'd load the "PHP/Laravel" pack and so on.</p>
<p>This isn't quite RAG: you're not doing runtime retrieval from a vector database; instead, you're swapping in pretrained weights or specialized model components that give the base model deep expertise in specific domains, while keeping the active computational footprint small.</p>
<p>The only problem is if you loaded a PHP knowledge pack, but the user asks a CSS question. What does the model do?</p>
<p>I think we can extend the tool-calling capabilities of these models to understand knowledge pack routing. Each knowledge pack would contain metadata that the model uses to determine how to route each prompt request. However, this mechanism would be built into the model itself and wouldn't pollute the context window (similar to how MCP can pollute the context window by publishing too many tools or resources).</p>
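<p>To make the routing idea concrete, here is a toy sketch; every name and mechanism in it is hypothetical, since no such pack format or API exists today. It only illustrates metadata-based routing with a fallback pack:</p>

```python
# Hypothetical knowledge-pack router: each pack ships keyword metadata,
# and the model routes a prompt to the best-matching pack (or a fallback).
import re
from dataclasses import dataclass, field

@dataclass
class KnowledgePack:
    name: str
    keywords: set = field(default_factory=set)

    def score(self, prompt: str) -> int:
        # Count how many metadata keywords appear in the prompt.
        words = set(re.findall(r"[a-z]+", prompt.lower()))
        return len(words.intersection(self.keywords))

def route(prompt, packs, fallback):
    best = max(packs, key=lambda p: p.score(prompt))
    return best if best.score(prompt) else fallback

php_pack = KnowledgePack("php-laravel", {"php", "laravel", "eloquent"})
css_pack = KnowledgePack("css", {"css", "flexbox", "selector"})
general = KnowledgePack("general-knowledge")

print(route("How do I center a div with flexbox?", [php_pack, css_pack], general).name)  # css
print(route("Who is Bob Barker?", [php_pack, css_pack], general).name)  # general-knowledge
```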
<h2 id="heading-knowledge-packs-can-fix-funding-roadblocks">Knowledge packs can fix funding roadblocks</h2>
<p>Whether you talk about Google AI Overviews or ChatGPT, it’s a reality that most of this data was, in a sense, “stolen” from publishers. Even if these tools cite the authors, most end-users won’t bother visiting the actual website; everyone is obsessed with instant gratification.</p>
<p>LLM companies used public data that publishers, developers, artists, and authors, in general, have spent thousands of hours creating; models usually obscure the information enough so that it cannot be easily associated with the original work, and when it can be, they cite the content created as a source.</p>
<p>These small content creators never get a dime for their hard work, and many have had to either downsize or shut down their websites entirely because they just aren’t getting enough direct traffic anymore; this is only going to get worse as trust in LLMs grows.</p>
<p>Allowing independent publishers to publish their own knowledge packs provides a mechanism for them to get back some of this lost revenue.</p>
<p>I, as an individual, could choose to buy a knowledge pack from an open-source shop, or pay a small monthly subscription to get the latest content, instead of paying $20-$100 to big tech companies that never contribute back to these communities or individual artisans.</p>
<p>Sure, this will probably inflate the monthly $20 Claude subscription, but with the big players on board, I’m sure we can work out a competitive funding model.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Not to be repetitive, but AGI is not coming!</p>
<p>We need to move away from this big LLM idea and focus on building smaller, narrowly focused tiny LLMs that can be paired with specialized knowledge packs. This approach offers a path toward more sustainable, affordable, and ethically sound AI.</p>
<p>One that reduces computational waste, empowers content creators, and delivers better results for specific use cases. The future of AI isn’t about building a giant AGI model that replaces humans; it’s about using AI as a tool, optimized to help both free and commercial users drive real value in their daily lives.</p>
<p>It’s about moving past the hype and capitalist ideals and building a future with AI responsibly!</p>
]]></content:encoded></item><item><title><![CDATA[Django cheat sheet]]></title><description><![CDATA[I move around frameworks quite a bit, usually between Golang, Next.js, Laravel, and Django. So I forget stuff all the time. In this article, I want to throw together a cheat sheet of sorts with some r]]></description><link>https://kevincoder.co.za/django-cheat-sheet</link><guid isPermaLink="true">https://kevincoder.co.za/django-cheat-sheet</guid><category><![CDATA[Python]]></category><category><![CDATA[Django]]></category><category><![CDATA[#django-tailwind]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Sat, 29 Nov 2025 11:49:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/6508a821cf1020be165986a9/afa5f2b1-afcd-46df-b45d-9078e41e6970.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I move around frameworks quite a bit, usually between Golang, Next.js, Laravel, and Django. So I forget stuff all the time. In this article, I want to throw together a cheat sheet of sorts with some random things I use in Django; hopefully, it’s also useful to you!</p>
<p>I’ll keep this article updated as I go along, so please consider joining my newsletter. Its only purpose is to automatically email you whenever I add new content to the blog or update existing articles.</p>
<h2>Model basics</h2>
<p>In Django, we use a model class to represent the structure of a database table, and each field represents a column in that table. When you create an instance of this class, you effectively represent one row of data from that table.</p>
<p>Note: By default, Django creates the <code>id</code> field as the primary key, so you don't need to manually set a primary key field.</p>
<pre><code class="language-python">class ContactForm(models.Model): 
   full_name = models.CharField(max_length=100) 
   message = models.CharField(max_length=500) 
   contact_number = models.CharField(max_length=15, blank=True, null=True)
   email = models.CharField(max_length=255)
   date_submitted = models.DateTimeField(auto_now_add=True, null=True)
</code></pre>
<p>Blank versus null?</p>
<ol>
<li><p><code>null</code> - Can store <code>NULL</code> in the database. If <code>False</code>, the DB column will be created with "not null", e.g. <code>full_name VARCHAR(100) NOT NULL</code>. Django default: <code>False</code>.</p>
</li>
<li><p><code>blank</code> - If set to <code>True</code>, the field can be empty in forms. Django default: <code>False</code>.</p>
</li>
</ol>
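<p>As a rough sketch (not Django's actual migration internals), the <code>null</code> flag is what decides whether the generated column DDL carries <code>NOT NULL</code>:</p>

```python
# Simplified illustration of how null=False maps to a NOT NULL column.
# This mimics the DDL shape only; Django's schema editor does far more.

def char_column_ddl(name: str, max_length: int, null: bool = False) -> str:
    ddl = f"{name} VARCHAR({max_length})"
    if not null:
        ddl += " NOT NULL"
    return ddl

print(char_column_ddl("full_name", 100))                 # full_name VARCHAR(100) NOT NULL
print(char_column_ddl("contact_number", 15, null=True))  # contact_number VARCHAR(15)
```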
<p>When adding/changing model fields, you must run the following to generate a migration:</p>
<pre><code class="language-bash">python manage.py makemigrations
</code></pre>
<p>If there are migrations, then also run:</p>
<pre><code class="language-bash">python manage.py migrate
</code></pre>
<p>Run this to access your database's shell, so that you can use raw SQL and navigate your database directly:</p>
<pre><code class="language-bash">python manage.py dbshell
</code></pre>
<p>Django automatically assigns a primary key to your table, and the default field name is <code>id</code>. You can, of course, change this by setting <code>primary_key=True</code> on another field in your model:</p>
<pre><code class="language-python">class Order(models.Model): 
    total_price = models.DecimalField(max_digits=18, decimal_places=2) 
    order_reference = models.CharField(max_length=50, primary_key=True)
</code></pre>
<p>If you want to represent a relationship between models, you can set this on children as follows:</p>
<pre><code class="language-python">class OrderItem(models.Model): 
    order = models.ForeignKey(Order, on_delete=models.CASCADE) 
    item_name = models.CharField(max_length=100) 
    item_price = models.DecimalField(max_digits=18, decimal_places=2) 
    quantity = models.IntegerField(default=1)

class UserProfile(models.Model): 
    user = models.OneToOneField(User, on_delete=models.CASCADE)

class Course(models.Model): 
    course_name = models.CharField(max_length=100)

class Student(models.Model): 
    student_name = models.CharField(max_length=100) 
    courses = models.ManyToManyField(Course)
</code></pre>
<p><code>models.CASCADE</code> and <code>models.DO_NOTHING</code> (more <a href="https://docs.djangoproject.com/en/6.0/ref/models/fields/#arguments">options</a> like <code>PROTECT</code> are also supported) are constraints that tell Django what to do with children when you delete a parent: <code>CASCADE</code> will delete all children, while <code>DO_NOTHING</code> will not delete anything.</p>
<p>Django will automatically slugify and create a table name using this format:</p>
<pre><code class="language-python">app_name_modelclassname

example: websites_contactform
</code></pre>
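<p>That default naming rule can be sketched as a one-liner (a simplification of what Django's <code>Meta</code> machinery computes):</p>

```python
# Django's default table name: app_label + "_" + model class name lowercased.
def default_db_table(app_label: str, model_name: str) -> str:
    return f"{app_label}_{model_name.lower()}"

print(default_db_table("websites", "ContactForm"))  # websites_contactform
```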
<p>You can change the table name if you need, as follows:</p>
<pre><code class="language-python">class ContactForm(models.Model):
    class Meta:
        db_table = "my_contact_form"

# table name in the db is: my_contact_form
</code></pre>
<p>User model - override the default:</p>
<pre><code class="language-python">class CustomUser(AbstractUser):
    username = None
    email = models.EmailField(unique=True, max_length=255)
    ... other fields

    USERNAME_FIELD = "email"
    REQUIRED_FIELDS = []

    def __str__(self):
        return self.email

# Settings file
  AUTH_USER_MODEL = 'accounts.CustomUser'
</code></pre>
<h2>Model Queries</h2>
<p>When you inherit from <code>models.Model</code> Django automatically sets up a default model manager named <code>objects</code> on every model class. Model managers are the interface between your Django code and the database; they allow you to query and manipulate data in a Pythonic way. Under the hood, the manager sits on top of a complex chain of classes that handles everything from building queries to communicating with the database backend.</p>
<blockquote>
<p>⚠️ I simplified some of the SQL queries to make it easier to understand. Django optimizes and generates more explicit SQL than my simple "SELECT * " versions.</p>
</blockquote>
<p>To perform queries, you can use methods of the manager as follows:</p>
<pre><code class="language-python">users = User.objects.filter(email="test@test.com")
orders = Order.objects.filter(status="paid", order_total__gte=500)
</code></pre>
<p>The filter method basically translates your query into something similar in SQL:</p>
<pre><code class="language-sql">SELECT * FROM users WHERE email = 'test@test.com';
SELECT * FROM orders WHERE status = 'paid' AND order_total &gt;= 500;

-- For simplicity, left out the full table name that Django would normally generate.
</code></pre>
<p>The <code>__gte</code> is a field lookup; these allow you to perform queries that are more advanced than a simple <code>=</code>. They support the typical math variants <code>__gt</code>, <code>__lt</code>, <code>__gte</code>, string matching: <code>__icontains</code>, in-list lookups: <code>__in</code>, and many more operations.</p>
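<p>As a toy illustration (a drastic simplification of Django's real lookup resolver), translating <code>field__lookup</code> kwargs into SQL fragments might look like:</p>

```python
# Hypothetical mini-resolver for Django-style field lookups.
# Real Django handles quoting, joins, and many more operators.

OPERATORS = {
    "gt": "> {}",
    "gte": ">= {}",
    "icontains": "ILIKE '%{}%'",
    "in": "IN {}",
}

def lookup_to_sql(**kwargs) -> str:
    clauses = []
    for key, value in kwargs.items():
        name, _, op = key.partition("__")
        if op:
            clauses.append(f"{name} {OPERATORS[op].format(value)}")
        else:
            clauses.append(f"{name} = {value!r}")  # plain equality
    return " AND ".join(clauses)

print(lookup_to_sql(status="paid", order_total__gte=500))
# status = 'paid' AND order_total >= 500
```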
<p>In addition to <code>filter</code>, you can also use <code>exclude</code>, which supports the same querying style but does the reverse: <code>filter</code> ensures results match the criteria, while <code>exclude</code> ensures results don't match the expressions passed to it.</p>
<p>By default, <code>filter</code>/<code>exclude</code> will do an AND query, and you can chain multiple of these together because the manager class creates a <code>QuerySet</code> object (sort of like an in-memory <code>where</code> statement). The <code>QuerySet</code> is lazy-loaded, meaning Django doesn't actually fetch the data until you evaluate it, e.g., by calling <code>count()</code> or looping through the records.</p>
<pre><code class="language-python">filter(status="paid", order_total__gte=500)
 ---&gt; status="paid" and order_total &gt;= 500
</code></pre>
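<p>The lazy-evaluation behavior described above can be sketched with a mock class; this is not Django's real <code>QuerySet</code>, just the chain-then-evaluate pattern it follows:</p>

```python
# Mock of QuerySet laziness: chaining only records conditions;
# nothing "executes" until an evaluation method like count() is called.

class LazyQuery:
    def __init__(self, conditions=None):
        self.conditions = conditions or []
        self.executed = False

    def filter(self, **kwargs):
        # Return a NEW query holding the extra conditions; no DB hit yet.
        return LazyQuery(self.conditions + list(kwargs.items()))

    def count(self):
        # Evaluation point: this is where Django would actually run SQL.
        self.executed = True
        return len(self.conditions)  # stand-in for a real row count

q = LazyQuery().filter(status="paid").filter(order_total__gte=500)
print(q.executed)  # False -- still lazy after chaining
print(q.count())   # 2 -- evaluation happens here
```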
<p>You can make it a SQL <code>OR</code> by using the special <code>Q</code> object:</p>
<pre><code class="language-python">from django.db.models import Q
orders = Order.objects.filter(
    Q(status="paid") | Q(order_total__gte=500)
)

---&gt; status="paid" OR order_total &gt;= 500
</code></pre>
<h2>Aggregate queries</h2>
<p>While <code>filter</code> allows you to perform common <code>where</code>-type queries in SQL, you sometimes need to do aggregates like <code>SUM</code>, <code>COUNT</code>, <code>AVG</code>, etc. Count is fairly easy:</p>
<pre><code class="language-python">Order.objects.filter(
    Q(status="paid") | Q(order_total__gte=500)
).count() # just add .count at the end of any query
</code></pre>
<p>For other aggregate types, Django provides special classes for these, and a model manager method <code>aggregate</code>:</p>
<pre><code class="language-python"> from django.db.models import Avg, Max, Sum

Order.objects.aggregate(Max("order_total", default=0))
Order.objects.aggregate(Avg("order_total", default=0))
Order.objects.aggregate(Sum("order_total", default=0))
</code></pre>
<p>Aggregate will basically generate a similar query:</p>
<pre><code class="language-sql">SELECT SUM(order_total) FROM orders;
-- returns as a result: {"order_total__sum": 1500.00}
</code></pre>
<p>Essentially a single value, but what if you want to <code>group by</code> and count records? For example, you have a bunch of e-commerce merchants, and you want to know how many orders each merchant received. In this case, we can use <code>annotate</code>:</p>
<pre><code class="language-python"># logic
from django.db.models import Avg, Max, Sum, Count

merchants = Merchant.objects.annotate(Count("orders", distinct=True))
# Django uses the related_name field here for "orders"

for m in merchants:
    print(m.orders__count)
</code></pre>
<p>The model:</p>
<pre><code class="language-python">class Order(models.Model): 
    merchant = models.ForeignKey(Merchant, on_delete=models.DO_NOTHING, related_name="orders")
</code></pre>
<p>Example SQL:</p>
<pre><code class="language-sql">SELECT merchants.*, COUNT(DISTINCT orders.id)
FROM merchants
LEFT JOIN orders ON orders.merchant_id = merchants.id
GROUP BY merchants.id;
</code></pre>
<p><code>aggregate</code> and <code>annotate</code> operate on a <code>QuerySet</code>, so you can also chain <code>filter</code> and <code>exclude</code> before aggregating/annotating your data.</p>
<pre><code class="language-python">Order.objects.filter(status="paid").aggregate(Sum("order_total", default=0))
</code></pre>
<p>You can print the raw SQL of any QuerySet by printing its <code>query</code> attribute:</p>
<pre><code class="language-python">query_set = User.objects.annotate(Count("order", distinct=True))
print(query_set.query)

# Output
SELECT "auth_user"."id", "auth_user"."password", "auth_user"."last_login", "auth_user"."is_superuser", "auth_user"."username", "auth_user"."first_name", "auth_user"."last_name", "auth_user"."email", "auth_user"."is_staff", "auth_user"."is_active", "auth_user"."date_joined", COUNT(DISTINCT "website_order"."order_reference") AS "order__count" FROM "auth_user" LEFT OUTER JOIN "website_order" ON ("auth_user"."id" = "website_order"."user_id") GROUP BY "auth_user"."id", "auth_user"."password", "auth_user"."last_login", "auth_user"."is_superuser", "auth_user"."username", "auth_user"."first_name", "auth_user"."last_name", "auth_user"."email", "auth_user"."is_staff", "auth_user"."is_active", "auth_user"."date_joined"
</code></pre>
<h2>Generate fake test data</h2>
<p>To generate fake data, you can use an excellent third-party library called <a href="https://faker.readthedocs.io/en/master/">Faker</a>. If you do this in a loop, Faker will automatically generate dynamic values, which is perfect for testing, as you can generate hundreds, even thousands, of fake users and other objects.</p>
<pre><code class="language-python">from faker import Faker  # pip install Faker
from django.contrib.auth.models import User

fake = Faker()

user = User.objects.create_user(
    username= fake.user_name(),
    email=fake.email(),
    first_name=fake.first_name(),
    last_name=fake.last_name(),
    password=fake.password()
)
</code></pre>
<p>These are just basic fields; the library can generate loads more with all kinds of customizations.</p>
<h2>Model managers</h2>
<p>Whenever you perform a DB query, you will notice we always chain methods to the <code>objects</code> property on our model, e.g.: <code>MyModel.objects.filter</code>. The "objects" property is an instance of a manager class. As the name suggests, this object "manages" all the interactions between the high-level APIs we developers use, such as <code>filter</code> and <code>exclude</code>, and the low-level Django ORM objects that eventually build SQL queries and execute them on the database server.</p>
<p>For example, we can build a custom manager that ensures that every time we use <code>filter</code>, we automatically append a where clause such as <code>team_id=x</code>.</p>
<p>Instead of using:</p>
<p><code>User.objects.filter(department="Accounting")</code></p>
<p>We could use:</p>
<p><code>User.for_team.filter(department="Accounting")</code></p>
<p>In the first query, the SQL will look similar to:</p>
<p><code>SELECT * from users where department = 'accounting'</code></p>
<p>However, the second query could look like:</p>
<p><code>SELECT * from users where department = 'accounting' and team_id = 2</code></p>
<p>A model manager example:</p>
<pre><code class="language-python">class TeamManager(models.Manager):
    def get_queryset(self):
        team_id = get_team_id()
        return super().get_queryset().filter(team_id=team_id)

class User(models.Model):
    for_team = TeamManager()
</code></pre>
<p>Remember <code>queryset</code> is a lazy representation of a database query, i.e., sort of like a virtual SQL statement that Django maintains in memory while you keep chaining <code>filter</code>, <code>exclude</code> and various other model methods.</p>
<p>So essentially, when you invoke the default model manager <code>.objects</code>, it sets up a blank <code>queryset</code> and manages all the changes to it until you finally execute that query on the database server.</p>
<p>So in this case, instead of returning a blank <code>queryset</code>, we give this <code>queryset</code> a where clause.</p>
<blockquote>
<p>ℹ️ <code>get_team_id()</code> is a custom function (i.e. not part of Django) that reads a <code>ContextVar</code> we set in our middleware. That logic is beyond the scope of managers, so I won't go further into it here; just assume that, whatever this function does, it returns a <code>team_id</code>, which we then use in our queryset.</p>
</blockquote>
<h2>Simple OTP generator</h2>
<p>You can use this for 2-factor auth. Store the OTP in a temp table, and then send the user an email with the PIN for them to confirm. Encryption is for storage, so the PIN isn’t easily readable in the DB table.</p>
<pre><code class="language-python">import base64
import secrets  # use the secrets module; `random` is not cryptographically secure
from hashlib import sha256

from cryptography.fernet import Fernet
from django.conf import settings


def generate_otp():
    return str(secrets.randbelow(900000) + 100000)  # 6 digits: 100000-999999


def get_encryption_key():
    key_hashed = sha256(settings.SECRET_KEY.encode()).digest()
    return base64.urlsafe_b64encode(key_hashed)


def encrypt_otp(otp_code):
    key = get_encryption_key()
    fernet = Fernet(key)
    encrypted = fernet.encrypt(otp_code.encode())
    return encrypted.decode("utf-8")


def decrypt_otp(encrypted_otp):
    if not encrypted_otp:
        return None

    try:
        key = get_encryption_key()
        fernet = Fernet(key)
        decrypted = fernet.decrypt(encrypted_otp.encode())
        return decrypted.decode("utf-8")
    except Exception:
        return None
</code></pre>
<h2>Decorator for protecting views</h2>
<p>While middleware is usually better, sometimes you just need to run some logic only for a handful of views, in which case it might be easier to use a decorator.</p>
<pre><code class="language-python">def is_account_verified(view_func):
    @wraps(view_func)
    def _wrapped_view(request, *args, **kwargs):
        if not request.user.is_authenticated:
            return redirect("login")

        if request.user.user_profile.is_confirmed is False:
            request.user.user_profile.send_verify_email(request)
            return redirect("must-verify")
        response = view_func(request, *args, **kwargs)
        return response

    return _wrapped_view


####### Use in views.py #######
from accounts.decorators import is_account_verified

@is_account_verified
def change_password(request):
    pass
</code></pre>
<h2>Auth views</h2>
<p>Customizing the auth: if you want to use Django's default auth system but not the admin, you can extend and customize the existing auth views to use your own templates along with your own custom register/login views.</p>
<pre><code class="language-python">from django.contrib.auth import views as auth_views
urlpatterns = [
    path("login/", login_view, name="login"),
    path("register/", register_view, name="register"),
    path("logout/", logout_view, name="logout"),
    path(
        "password_reset/",
        auth_views.PasswordResetView.as_view(
            template_name="auth/password_reset_form.html",
            html_email_template_name="auth/emails/password_reset_email.html",
            subject_template_name="auth/emails/password_reset_subject.txt",
        ),
        name="password_reset",
    ),
    path(
        "password_reset/done/",
        auth_views.PasswordResetDoneView.as_view(
            template_name="auth/password_reset_done.html"
        ),
        name="password_reset_done",
    ),
    path(
        "reset/&lt;uidb64&gt;/&lt;token&gt;/",
        auth_views.PasswordResetConfirmView.as_view(
            template_name="auth/password_reset_confirm.html"
        ),
        name="password_reset_confirm",
    ),
    path(
        "reset/done/",
        auth_views.PasswordResetCompleteView.as_view(
            template_name="auth/password_reset_complete.html"
        ),
        name="password_reset_complete",
    )
]
</code></pre>
<pre><code class="language-python">def login_view(request):
    if request.user.is_authenticated:
        return redirect("dashboard")

    if request.method == "POST":
        username = request.POST.get("username")
        password = request.POST.get("password")
        user = authenticate(request, username=username, password=password)
        if user is not None and user.user_profile.is_active is False:
            messages.error(
                request, "Sorry, but your administrator has disabled your account."
            )
        elif user is not None:
            if user.user_profile.is_twofactor_enabled:
                request.session["2fa_user_id"] = user.id
                request.session["2fa_timestamp"] = timezone.now().isoformat()
                return redirect("send_2fa_email")
            else:
                login(request, user)
                if user.user_profile.needs_password_change:
                    return redirect("change_password")
                return redirect("dashboard")
        else:
            messages.error(request, "Invalid username or password.")

    return render(request, "auth/login.html")


def register_view(request):
    if request.user.is_authenticated:
        return redirect("dashboard")

    context = {}

    if request.method == "POST":
        email = request.POST.get("email")
        password1 = request.POST.get("password1")
        password2 = request.POST.get("password2")
        full_name = request.POST.get("name")

        context = {
            "email": email,
            "name": full_name
        }

        if password1 != password2:
            messages.error(request, "Passwords do not match.")
            return render(request, "auth/register.html", context)

        if len(password1) &lt; 8:
            messages.error(request, "Password must be at least 8 characters long.")
            return render(request, "auth/register.html", context)

        if not re.search(r"\d", password1):
            messages.error(request, "Password must include at least one number.")
            return render(request, "auth/register.html", context)

        if not re.search(r"[A-Z]", password1):
            messages.error(
                request, "Password must include at least one uppercase letter."
            )
            return render(request, "auth/register.html", context)

        if not full_name:
            messages.error(request, "Oops! Please tell us your name?")
            return render(request, "auth/register.html", context)

        # probably want a more generic message
        if User.objects.filter(email=email).exists():
            messages.error(request, "Email already in use.")
            return render(request, "auth/register.html", context)

        try:
            user = User.objects.create_user(
                email=email, password=password1
            )
            full_name = full_name.split(" ")
            if len(full_name) &gt; 1:
                user.first_name = full_name[0]
                user.last_name = " ".join(full_name[1:])
            else:
                user.first_name = full_name[0]
                user.last_name = ""

            user.save()
            messages.success(
                request, "Account created successfully. You can now log in."
            )

            return redirect("login")
        except Exception as e:
            messages.error(request, f"Error creating account: {e}")
            return render(request, "auth/register.html", context)

    return render(request, "auth/register.html", context)


def logout_view(request):
    logout(request)
    return redirect("login")
</code></pre>
<h2>Management commands</h2>
<pre><code class="language-python"># --- in app_name/management/commands
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = "Some important task description here"

    def add_arguments(self, parser):
        parser.add_argument("--email", type=str, help="Email address of the user")

    def handle(self, *args, **options):
        email = options.get("email")
        if not email:
            email = input("Enter email address of the user: ")
        # .... rest of the code

        self.stdout.write(
            self.style.SUCCESS("Successfully did whatever I was supposed to do.")
        )
</code></pre>
<h2>Dead simple background tasks without Celery complication</h2>
<p>The purpose of this is to run simple tasks in the background; it’s not meant for high concurrency or complex jobs. For that, just use Celery. The beauty of this approach is that there’s no external service to manage (besides the systemd script), so there are fewer moving parts.</p>
<p>In an app called “background_jobs”, models.py:</p>
<pre><code class="language-python">from django.db import models
from django.utils import timezone


class Statues(models.TextChoices):
    PENDING = "pending", "pending"
    IN_FLIGHT = "in_flight", "in_flight"
    FAILED = "failed", "failed"
    COMPLETE = "complete", "complete"


class Queues(models.TextChoices):
    HIGH = "high", "high"
    LOW = "low", "low"
    NORMAL = "normal", "normal"


class QueueTask(models.Model):
    job_data = models.JSONField("TASK", default=dict)
    status = models.CharField(max_length=20, choices=Statues, default=Statues.PENDING)
    scheduled_for = models.DateTimeField(default=timezone.now)
    queue = models.CharField(max_length=20, choices=Queues, default=Queues.NORMAL)
    callable = models.CharField(max_length=100, null=False, blank=False)

    @classmethod
    def queueTask(cls, job_data, callable, queue=None, scheduled_for=None):
        if callable is None:
            raise Exception(
                "Callable needs to be a task declared in background_jobs/tasks."
            )

        q = QueueTask()
        q.status = Statues.PENDING
        q.scheduled_for = timezone.now() if scheduled_for is None else scheduled_for
        q.queue = Queues.NORMAL if queue is None else queue
        q.callable = callable
        q.job_data = job_data
        q.save()
</code></pre>
<p>Example background_jobs/tasks/notification.py:</p>
<pre><code class="language-python">from django.conf import settings
from django.core.mail import EmailMessage
from django.utils import timezone

from background_jobs.models import QueueTask

def send_email(task: QueueTask):
    json_data = task.job_data
    for to_email in json_data["to_emails"]:
        email = EmailMessage(
            json_data["subject"],
            json_data["message"],
            (
                json_data["from"]
                if "from" in json_data.keys()
                else settings.DEFAULT_FROM_EMAIL
            ),
            [to_email],
        )
        email.content_subtype = json_data["format"] if "format" in json_data else "html"
        email.send()
</code></pre>
<p>In background_jobs/management/commands/worker.py:</p>
<pre><code class="language-python">import importlib
import logging
import time

from django.core.management.base import BaseCommand
from django.utils import timezone

from background_jobs.models import Queues, QueueTask, Statues

logger = logging.getLogger(__name__)


class Command(BaseCommand):
    help = "Process pending tasks from a specified queue"

    def handle(self, *args, **options):
        while True:
            try:
                tasks = QueueTask.objects.filter(
                    status=Statues.PENDING,
                    scheduled_for__lte=timezone.now(),
                    queue=Queues.NORMAL,
                ).order_by("-scheduled_for")[:20]

                for t in tasks:
                    self.execute_task(t)
            except Exception as ex:
                print(ex)
            finally:
                time.sleep(15 / 1000)

    def execute_task(self, task: QueueTask):
        """Execute a single task"""
        try:
            self.stdout.write(f"Executing task {task.id}")
            task.status = Statues.IN_FLIGHT
            task.save()

            module_name, function_name = task.callable.split(".")
            module = importlib.import_module("background_jobs.tasks." + module_name)

            getattr(module, function_name)(task)

            task.status = Statues.COMPLETE
            task.save()

            logger.info(self.style.SUCCESS(f"Task {task.id} completed successfully"))

        except Exception as e:
            task.status = Statues.FAILED
            task.save()
            logger.error(self.style.ERROR(f"Task {task.id} failed: {str(e)}"))
</code></pre>
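<p>The interesting bit in the worker is how the stored callable string is turned back into a function: <code>importlib</code> resolves the module and <code>getattr</code> pulls out the function. A standalone sketch of that dispatch pattern, using a stdlib function so it runs anywhere:</p>

```python
import importlib


def resolve_callable(dotted_path: str):
    """Resolve a dotted path like 'json.dumps' into the callable it names."""
    # rsplit keeps nested module paths ("package.module.func") working
    module_path, function_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_path)
    return getattr(module, function_name)


dumps = resolve_callable("json.dumps")
print(dumps({"a": 1}))  # → {"a": 1}
```

<p>In the worker, the same idea is anchored under the <code>background_jobs.tasks</code> package, so only modules inside that package can ever be invoked, which doubles as a small safety boundary.</p>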
<p>Then just set up a systemd service on Ubuntu / Debian (/etc/systemd/system/worker.service):</p>
<pre><code class="language-ini">[Unit]
Description=Worker daemon for my django app queue
After=network.target

[Service]
User=pythonapp
Group=pythonapp
WorkingDirectory=/home/pythonapp/myproject
ExecStart=/home/pythonapp/.venv/bin/python manage.py worker

Restart=on-failure
RestartSec=5s

Environment=DJANGO_SETTINGS_MODULE=myproject.settings

[Install]
WantedBy=multi-user.target
</code></pre>
<p>Then reload systemd and enable the service:</p>
<pre><code class="language-bash">systemctl daemon-reload 
systemctl enable worker.service
systemctl start worker.service
</code></pre>
<p>Check the status of the worker:</p>
<pre><code class="language-bash">systemctl status worker.service
</code></pre>
<p>Now, throughout your application, you can then queue work like so:</p>
<pre><code class="language-python">QueueTask.queueTask(
    {
        "subject": subject,
        "message": message,
        "to_emails": [self.user.username],
        "format": "html",
    },
    "notifications.send_email",
)
</code></pre>
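<p>On the other end, the callable string maps to a function in background_jobs/tasks/ that receives the task row and reads its arguments back out of the JSON payload. A rough sketch of that handler shape (field and attribute names here are illustrative, with a plain dict standing in for the model instance):</p>

```python
# Hypothetical handler shape: a plain dict stands in for the QueueTask row.
def send_email(task: dict) -> dict:
    data = task["payload"]  # the dict originally passed to queueTask
    return {
        "subject": data["subject"],
        "to": data["to_emails"],
        # mirror the article's fallback: default to HTML emails
        "format": data.get("format", "html"),
    }


fake_task = {"payload": {"subject": "Hi", "message": "Hello!", "to_emails": ["a@example.com"]}}
print(send_email(fake_task)["format"])  # → html
```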
<p>NGINX config to reverse proxy requests to the Gunicorn worker:</p>
<pre><code class="language-nginx">server {
    listen 80;
    real_ip_header CF-Connecting-IP;
    server_name mydomain.com www.mydomain.com;

    # Redirect HTTP to HTTPS
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name mydomain.com www.mydomain.com;
    real_ip_header CF-Connecting-IP;
    # SSL configuration
    ssl_certificate /etc/ssl/mydomain.com.cert;
    ssl_certificate_key /etc/ssl/mydomain.com.key;
    ssl_protocols TLSv1.2 TLSv1.3;

    # Security headers
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-XSS-Protection "1; mode=block" always;

    # Static file serving
    location /static/ {
        autoindex on;
        alias /var/www/static/;
        expires 30d;
        access_log off;
    }

    # Proxy connections to Gunicorn
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_connect_timeout 75s;
        proxy_read_timeout 300s;
        client_max_body_size 10m;
    }

    # Logging
    access_log /var/log/nginx/mydomain.com.access.log;
    error_log /var/log/nginx/mydomain.com.error.log;
}
</code></pre>
<p><strong>Gunicorn worker config:</strong></p>
<pre><code class="language-ini">[Unit]
Description=Gunicorn daemon for myproject.com
After=network.target

[Service]
User=pythonapp
Group=pythonapp
WorkingDirectory=/home/pythonapp/myproject
ExecStart=/home/pythonapp/.venv/bin/gunicorn \
          --access-logfile - \
          --workers 8 \
          --threads 2 \
          --timeout 180 \
          --max-requests 1000 \
          --max-requests-jitter 100 \
          --bind 127.0.0.1:8000 \
          myproject.wsgi:application

Restart=on-failure
RestartSec=5s

# Environment variables if needed
Environment=DJANGO_SETTINGS_MODULE=myproject.settings

[Install]
WantedBy=multi-user.target
</code></pre>
<h2>Forms - implementing a change password form</h2>
<pre><code class="language-python">from django.contrib.auth.password_validation import validate_password
from django.core.exceptions import ValidationError as DjangoValidationError
from django import forms


class CustomPasswordChangeForm(forms.Form):
    current_password = forms.CharField(
        label="Current Password", 
        widget=forms.PasswordInput(attrs={'autocomplete': 'current-password'}),
        required=True
    )
    new_password1 = forms.CharField(
        label="New Password", 
        widget=forms.PasswordInput(attrs={'autocomplete': 'new-password'}),
        required=True
    )
    new_password2 = forms.CharField(
        label="Confirm New Password", 
        widget=forms.PasswordInput(attrs={'autocomplete': 'new-password'}),
        required=True
    )

    def __init__(self, user, *args, **kwargs):
        self.user = user
        super().__init__(*args, **kwargs)

    def clean_current_password(self):
        current_password = self.cleaned_data.get("current_password")
        if not self.user.check_password(current_password):
            raise forms.ValidationError(
                "Your current password was entered incorrectly."
            )
        return current_password

    def clean_new_password1(self):
        """Validate new password against Django's validators"""
        password = self.cleaned_data.get("new_password1")
        if password:
            try:
                validate_password(password, self.user)
            except DjangoValidationError as e:
                raise forms.ValidationError(e.messages)
        return password

    def clean_new_password2(self):
        password1 = self.cleaned_data.get("new_password1")
        password2 = self.cleaned_data.get("new_password2")

        if password1 and password2 and password1 != password2:
            raise forms.ValidationError("The two password fields didn't match.")

        return password2
    
    def clean(self):
        cleaned_data = super().clean()
        current = cleaned_data.get("current_password")
        new = cleaned_data.get("new_password1")
   
        if current and new and current == new:
            raise forms.ValidationError(
                "Your new password cannot be the same as your current password."
            )
        
        return cleaned_data

    def save(self, commit=True):
        password = self.cleaned_data["new_password1"]
        self.user.set_password(password)
        if commit:
            self.user.save()
        return self.user
</code></pre>
<pre><code class="language-python">from django.contrib.auth import update_session_auth_hash
from django.contrib.auth.decorators import login_required
from django.http import JsonResponse
from django.views.decorators.http import require_http_methods

from .forms import CustomPasswordChangeForm  # adjust to wherever the form lives


@login_required
@require_http_methods(["POST"])
def change_password(request):
    if request.headers.get("X-Requested-With") == "XMLHttpRequest":
        form = CustomPasswordChangeForm(request.user, request.POST)
        if form.is_valid():
            form.save()
            update_session_auth_hash(request, form.user)
            return JsonResponse(
                {"success": True, "message": "Your password was successfully changed."}
            )
        else:
            return JsonResponse({"success": False, "errors": form.errors}, status=400)

    return JsonResponse({"success": False, "message": "Invalid request"}, status=400)
</code></pre>
<h2>Setting up S3 storage in settings.py</h2>
<pre><code class="language-python">INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'django_redis',
    'shared',
    "storages",   &lt;----
]
</code></pre>
<pre><code class="language-bash">pip install django-storages
</code></pre>
<pre><code class="language-python"># Public S3 Storage Configuration
AWS_ACCESS_KEY_ID = os.getenv('AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = os.getenv('AWS_SECRET_ACCESS_KEY')
AWS_STORAGE_BUCKET_NAME = os.getenv('AWS_STORAGE_BUCKET_NAME', 'mybucket')
AWS_S3_REGION_NAME = os.getenv('AWS_S3_REGION_NAME', 'fsn1')
AWS_S3_ENDPOINT_URL = os.getenv('AWS_S3_ENDPOINT_URL', f'https://{AWS_S3_REGION_NAME}.your-objectstorage.com')
AWS_S3_CUSTOM_DOMAIN = os.getenv('AWS_S3_CUSTOM_DOMAIN', f'{AWS_STORAGE_BUCKET_NAME}.{AWS_S3_REGION_NAME}.your-objectstorage.com')
AWS_S3_OBJECT_PARAMETERS = {
    'CacheControl': 'max-age=86400',
}
AWS_DEFAULT_ACL = 'public-read'
AWS_QUERYSTRING_AUTH = False
AWS_S3_FILE_OVERWRITE = False

# Private S3 Storage Configuration
PRIVATE_BUCKET_NAME = os.getenv('PRIVATE_BUCKET_NAME', 'myprivatebucket')
PRIVATE_S3_REGION_NAME = os.getenv('PRIVATE_S3_REGION_NAME', 'nbg1')
PRIVATE_S3_ENDPOINT_URL = os.getenv('PRIVATE_S3_ENDPOINT_URL', f'https://{PRIVATE_S3_REGION_NAME}.your-objectstorage.com')
PRIVATE_S3_CUSTOM_DOMAIN = os.getenv('PRIVATE_S3_CUSTOM_DOMAIN', f'{PRIVATE_BUCKET_NAME}.{PRIVATE_S3_REGION_NAME}.your-objectstorage.com')

STORAGES = {
    "default": {
        "BACKEND": "storages.backends.s3boto3.S3Boto3Storage",
        "OPTIONS": {
            "access_key": AWS_ACCESS_KEY_ID,
            "secret_key": AWS_SECRET_ACCESS_KEY,
            "bucket_name": AWS_STORAGE_BUCKET_NAME,
            "region_name": AWS_S3_REGION_NAME,
            "endpoint_url": AWS_S3_ENDPOINT_URL,
            "custom_domain": AWS_S3_CUSTOM_DOMAIN,
            "object_parameters": AWS_S3_OBJECT_PARAMETERS,
            "default_acl": AWS_DEFAULT_ACL,
            "querystring_auth": AWS_QUERYSTRING_AUTH,
            "file_overwrite": AWS_S3_FILE_OVERWRITE,
            "location": "",
        },
    },
    "private": {
        "BACKEND": "storages.backends.s3boto3.S3Boto3Storage",
        "OPTIONS": {
            "access_key": AWS_ACCESS_KEY_ID,
            "secret_key": AWS_SECRET_ACCESS_KEY,
            "bucket_name": PRIVATE_BUCKET_NAME,
            "region_name": PRIVATE_S3_REGION_NAME,
            "endpoint_url": PRIVATE_S3_ENDPOINT_URL,
            "custom_domain": PRIVATE_S3_CUSTOM_DOMAIN,
            "object_parameters": AWS_S3_OBJECT_PARAMETERS,
            "default_acl": 'private',
            "querystring_auth": True,
            "file_overwrite": AWS_S3_FILE_OVERWRITE,
            "location": "",
        },
    },
    "staticfiles": {
        "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
    },
}

MEDIA_URL = f'https://{AWS_S3_CUSTOM_DOMAIN}/'
</code></pre>
<h2>Global shared template tag</h2>
<p>Sometimes you want to access branding and other important information in templates. Instead of hardcoding values or repeating yourself all the time, put those settings in settings.py and then reference them using this shared template tag.</p>
<pre><code class="language-python">TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [os.path.join(BASE_DIR, 'templates')],
        'APP_DIRS': True,
        'OPTIONS': {
            "builtins": [
                "shared.templatetags.global_settings_tags",  &lt;------
            ],
            'context_processors': [
                'shared.context_processors.global_settings',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]

# Only allow these settings to be accessed from template tag
ALLOWED_TEMPLATE_SETTING_ACCESS = {
    'APPLICATION_LOGO',
    'APPLICATION_NAME',
    'CONTACT_EMAIL',
    'SUPPORT_URL',
    'MEDIA_URL',
    'STATIC_URL',
    # Add other settings you want templates to have access to
}
</code></pre>
<p>In shared/templatetags/global_settings_tags.py</p>
<pre><code class="language-python">from django import template
from django.conf import settings

register = template.Library()

@register.simple_tag
def get_setting(name):
    name = name.strip()
    if name not in settings.ALLOWED_TEMPLATE_SETTING_ACCESS:
        return ""

    return getattr(settings, name, "")
</code></pre>
<p>Now, anywhere in your templates, you can access:</p>
<pre><code class="language-python">   &lt;img src="{% get_setting 'APPLICATION_LOGO' %}" alt="logo" /&gt;
</code></pre>
<p>Note: You probably want to break your settings file up into multiple config files. For larger projects, I normally create a “/config” folder and split my settings files up in there.</p>
<h2>Django and Tailwind</h2>
<p>tailwind.config.js at the root of your project</p>
<pre><code class="language-javascript">/** @type {import('tailwindcss').Config} */
module.exports = {
  content: [
    "./templates/**/*.{html,js}"
  ],
  theme: {
    extend: {
      colors: {
        primary: '#5DDED6',
        secondary: '#8b5cf6',
        dark: '#1e293b',
      },
      fontFamily: {
        sans: ['Inter', 'sans-serif'],
      },
      animation: {
        'float': 'float 3s ease-in-out infinite',
      },
      keyframes: {
        float: {
          '0%, 100%': { transform: 'translateY(0)' },
          '50%': { transform: 'translateY(-10px)' },
        }
      }
    },
  },
  plugins: [],
}
</code></pre>
<p>postcss.config.js</p>
<pre><code class="language-javascript">module.exports = {
  plugins: {
    tailwindcss: {},
    autoprefixer: {},
  },
}
</code></pre>
<p>package.json:</p>
<pre><code class="language-javascript">{
  ----
  "scripts": {
    "build:css:prod": "tailwindcss -i ./static/css/app.css -o ./static/build/app.css --minify",
    "watch:css": "tailwindcss -i ./static/css/app.css -o ./static/build/app.css --watch",
    "watch:django": "python manage.py runserver",
    "dev": "npm-run-all --parallel watch:*"
  },
  "devDependencies": {
    "@fortawesome/fontawesome-free": "^6.7.2",
    "autoprefixer": "^10.4.21",
    "postcss": "^8.5.3",
    "tailwindcss": "^3.3.3"
  },
  "dependencies": {
    "npm-run-all": "^4.1.5"
  }
  ----
}
</code></pre>
<pre><code class="language-bash">npm run dev             # hot reload Python and Tailwind
npm run build:css:prod  # production build
</code></pre>
]]></content:encoded></item><item><title><![CDATA[Workhorse AI models you probably ignored]]></title><description><![CDATA[While the whole YouTube influencer space goes crazy about Gemini 3, I’m not so impressed. Don’t get me wrong, Gemini models are fantastic! And the V3 iteration is undoubtedly cool and capable. I’ve briefly run some tests (not benchmarks but actual re...]]></description><link>https://kevincoder.co.za/workhorse-ai-models-you-probably-ignored</link><guid isPermaLink="true">https://kevincoder.co.za/workhorse-ai-models-you-probably-ignored</guid><category><![CDATA[LLM's ]]></category><category><![CDATA[openrouter]]></category><category><![CDATA[small language model]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Thu, 20 Nov 2025 16:19:52 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763655645493/e937fbe1-4b9e-414d-bef3-acb01c1517c7.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>While the whole YouTube influencer space goes crazy about Gemini 3, I’m not so impressed. Don’t get me wrong, Gemini models are fantastic! And the V3 iteration is undoubtedly cool and capable. I’ve briefly run some tests (not benchmarks but actual real-world tests) on Gemini 3.</p>
<p>Meh 😐, it’s okay, but it's better than GPT-5 for sure, and it’s incremental just like most other models we’ve seen this year.</p>
<p>When the dust settles and we get past all the hype, what really matters is how reliable and cost-effective these models are for day-to-day use, not just coding but also business use cases.</p>
<p>In this article, I want to talk about some lesser-known models that all the YouTubers and the media in general miss.</p>
<h2 id="heading-gemini-flash">Gemini flash</h2>
<p>My favourite model is Gemini 2.0 Flash for general machine learning and not-so-complicated tasks like RAG chatbots.</p>
<p>Flash is super fast compared to Anthropic’s Haiku and OpenAI mini models, yet feels much smarter than nearly all other mini models I have worked with.</p>
<p>I’m using Flash models for:</p>
<ul>
<li><p>Voice AI tasks.</p>
</li>
<li><p>RAG chatbots.</p>
</li>
<li><p>Categorization of products.</p>
</li>
<li><p>Rewriting search queries.</p>
</li>
<li><p>Various text generations.</p>
</li>
</ul>
<p>Back in the day, GPT-4o mini was the goat of small models, but alas, in the last year or so, Google has just woken up and dominated the small model space.</p>
<h2 id="heading-phi-3-phi-4-models">PHI-3 / PHI-4 models</h2>
<p>If you have a small 8GB - 16GB consumer GPU on hand and want to run a local model, then these PHI models are impressive. I use them for simple classification and text generation tasks. They generally perform well.</p>
<p>Sure, they are not as smart as the Gemini models, but if you use this model in combination with a semantic search or some other process, for example, categorization, you could do a semantic search to find the most relevant list of categories, then ask the model to pick the most appropriate.</p>
<p>This will save you some per-token costs and give you a decent level of accuracy.</p>
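<p>The shortlist-then-pick idea is easy to sketch. In the snippet below, the semantic search and the model call are stubbed out as plain functions (hypothetical helpers), so only the control flow is real:</p>

```python
def categorize(product: str, search_candidates, ask_model, top_k: int = 5) -> str:
    """Two-stage categorization: semantic search narrows, a small model picks."""
    candidates = search_candidates(product, top_k)  # e.g. a vector-DB query
    prompt = f"Pick the best category for {product!r} from: " + ", ".join(candidates)
    choice = ask_model(prompt).strip()
    # Guard against the model answering with something outside the shortlist
    return choice if choice in candidates else candidates[0]


# Stubbed usage:
fake_search = lambda product, k: ["Laptops", "Phones", "Tablets"]
fake_model = lambda prompt: "Laptops"
print(categorize("ThinkPad X1", fake_search, fake_model))  # → Laptops
```

<p>Because the model only has to choose among a handful of pre-filtered options, a small local model like PHI-4 is usually accurate enough, and the prompt stays tiny.</p>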
<h2 id="heading-zai-glm-46">Z.Ai GLM 4.6</h2>
<p>It’s become a common practice for developers to use AI coding assistants like Claude Sonnet, Cursor, Windsurf, and so forth.</p>
<p>I myself love Sonnet and use it often, but Sonnet can be expensive and slow at times, which really impacts your flow state, not to mention all the daily and weekly limits if you’re using one of the subscription plans.</p>
<p>I still write and plan the “complex” stuff myself, but there’s just some boilerplate 🥱 brain-dead CRUD stuff that I just don’t need to write myself anymore; in the past, I would use Sonnet for these tasks, but of late have been really impressed by how much mileage I got out of GLM 4.6.</p>
<p>Z.ai subscriptions start from $6 a month, or you can use something like OpenRouter instead, and it’s fairly capable. I’ve generated some good Python, PHP, and even Golang code with the 4.6 model (using cline in VSCode).</p>
<p>You still need to review and clean up the code to an extent, like with Sonnet, but the 4.6 model seems fairly good with clearly defined tasks, tailwind stuff, and just CRUD-type low-thinking tasks.</p>
<p>Sure, Sonnet is still better at the more complicated stuff, but anyway, complex problems are fun for me, so I just use my brain for those 🙃.</p>
<h2 id="heading-minimax-models">Minimax models</h2>
<p>Like GLM 4.6, the Minimax models are also very capable of agentic and coding tasks, and comparatively, the price is way cheaper than Sonnet. On OpenRouter, it’s like $2 per million output tokens.</p>
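<p>At those rates the savings are easy to sanity-check with a quick token-cost helper (the prices below are illustrative, not quotes):</p>

```python
def cost_usd(output_tokens: int, price_per_million: float) -> float:
    """Cost of generating output_tokens at a given $-per-million-tokens rate."""
    return output_tokens / 1_000_000 * price_per_million


# e.g. 500k output tokens at $2/M vs a pricier frontier model at $15/M:
print(cost_usd(500_000, 2.0))   # → 1.0
print(cost_usd(500_000, 15.0))  # → 7.5
```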
<p>Similar to GLM 4.6, it’s not as smart as Sonnet; however, remember that big ol brain you have! It’s usually good enough for most code generation tasks and can also be used as a smarter alternative to GPT-5 or any of the GPT models.</p>
<p>Let’s be honest, OpenAI is just not as good as it once was. Their API response times are not great, and most of these other alternative models have caught up.</p>
<h2 id="heading-openrouter">OpenRouter</h2>
<p>Much of what I am referencing here are models that I use through OpenRouter, which also happens to be one of my favourite AI platforms ever!</p>
<p>If you don’t know, OpenRouter is basically a “Router for AI”. Instead of you having to create accounts with OpenAI, Anthropic, Google, etc, you can simply register with them and buy credits. Those credits can then be used across hundreds (if not thousands) of models.</p>
<p>So, for example, in one project using just a single SDK implementation, you can swap between Gemini, Anthropic, Z.ai, OpenAI, Mistral, and many more providers by just changing the model name!</p>
<p>They also allow for multiple providers. Take GLM 4.6, for example: about 9-10 different companies host the model, so if one provider is experiencing an outage, OpenRouter will automatically route your API call to one of the others. You can also optimize for costs and pick the cheapest provider!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763654587447/6fc58d62-b4ed-4835-a360-54f7b476e858.png" alt class="image--center mx-auto" /></p>
<p>Okay, this is sounding like an OpenRouter ad now, totally not (wouldn’t mind if they sponsored me though 😉), it’s really cool!</p>
<p>The only problem is that sometimes the API is slower compared to using the mainstream provider directly, and there is a small markup they add to your token usage to obviously make a profit, but generally, by using different providers and models, you end up saving money.</p>
]]></content:encoded></item><item><title><![CDATA[Zod and React Hook Form]]></title><description><![CDATA[I’m learning Next.js; I know React or kinda know React since I started building React apps in 2018 or thereabout, and then stopped using it for several years. Shew! A lot has changed: Server actions, Zod, and just the whole ecosystem generally.
This ...]]></description><link>https://kevincoder.co.za/zod-and-react-hook-form</link><guid isPermaLink="true">https://kevincoder.co.za/zod-and-react-hook-form</guid><category><![CDATA[React]]></category><category><![CDATA[next js]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Sat, 30 Aug 2025 08:33:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1756542808718/1d80adf8-2cf6-4610-a5dc-08715d822e35.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I’m learning Next.js; I know React or kinda know React since I started building React apps in 2018 or thereabout, and then stopped using it for several years. Shew! A lot has changed: Server actions, Zod, and just the whole ecosystem generally.</p>
<p>This will be a quick guide, purely meant as self-documentation to help me remember. I found it a bit annoying to find good documentation; it seems like the wild, wild west 🙃! There are many different documentation sources, and they all seem to do things differently.</p>
<blockquote>
<p>PS: if you’re a Next.js pro, feel free to suggest a better way of doing things. I’m constantly looking to follow best practices wherever I can.</p>
</blockquote>
<h2 id="heading-what-are-server-components">What are Server Components?</h2>
<p>I personally think this is a game-changer, purely because I built many SPAs and understand the pain of maintaining an API, dealing with CORS, and context switching.</p>
<p>In Next.js, you can run React code on the server side. This means that Next.js will pre-compile the React code and then send the result back to the browser. So, for example, in the past, to fetch data in a component, you would need a separate API and boilerplate code like this:</p>
<pre><code class="lang-typescript"><span class="hljs-string">"use client"</span>
<span class="hljs-keyword">import</span> { useEffect, useState } <span class="hljs-keyword">from</span> <span class="hljs-string">"react"</span>;

<span class="hljs-keyword">interface</span> Post {
  id: <span class="hljs-built_in">number</span>;
  userId: <span class="hljs-built_in">number</span>;
  title: <span class="hljs-built_in">string</span>;
  body: <span class="hljs-built_in">string</span>;
}

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">Home</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">const</span> [posts, setPosts] = useState&lt;Post[]&gt;([]);

  useEffect(<span class="hljs-function">() =&gt;</span> {
    <span class="hljs-keyword">const</span> loadPosts = <span class="hljs-keyword">async</span> () =&gt; {
      <span class="hljs-keyword">try</span> {
        <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> fetch(<span class="hljs-string">'https://jsonplaceholder.typicode.com/posts'</span>);
        <span class="hljs-keyword">if</span> (!response.ok) {
          <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">`HTTP error! status: <span class="hljs-subst">${response.status}</span>`</span>);
        }
        <span class="hljs-keyword">const</span> apiPosts: Post[] = <span class="hljs-keyword">await</span> response.json();
        setPosts(apiPosts);
      } <span class="hljs-keyword">catch</span> (error) {
        <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Failed to fetch posts:'</span>, error);
      }
    };

    loadPosts();
  }, []);

  <span class="hljs-keyword">return</span> (
    &lt;div&gt;
      {posts.map(<span class="hljs-function"><span class="hljs-params">post</span> =&gt;</span> (
        &lt;div key={post.id}&gt;
          &lt;h3&gt;{post.title}&lt;/h3&gt;
          &lt;p&gt;{post.body}&lt;/p&gt;
        &lt;/div&gt;
      ))}
    &lt;/div&gt;
  );
}
</code></pre>
<p>It's such a pain. You have to use state and useEffect. Plus, you actually need an API. Here, I’m just using a public API, but for a real-world application, you’d need to build that API for your internal business data using Express.js or some other backend framework (You could also use Next.js API routes).</p>
<p>With <code>Server Components</code> We can greatly simplify this code; we don’t need to build API endpoints or set up new routes; we can just create a simple function and call it in our component:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">interface</span> Post {
  id: <span class="hljs-built_in">number</span>;
  userId: <span class="hljs-built_in">number</span>;
  title: <span class="hljs-built_in">string</span>;
  body: <span class="hljs-built_in">string</span>;
}

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">loadPosts</span>(<span class="hljs-params"></span>) :<span class="hljs-title">Promise</span>&lt;<span class="hljs-title">Post</span>[]&gt; </span>{
  <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> fetch(<span class="hljs-string">'https://jsonplaceholder.typicode.com/posts'</span>);
  <span class="hljs-keyword">return</span> response.json();
}

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">Home</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">const</span> posts = <span class="hljs-keyword">await</span> loadPosts();

  <span class="hljs-keyword">return</span> (
    &lt;div&gt;
      {posts.map(<span class="hljs-function"><span class="hljs-params">post</span> =&gt;</span> (
        &lt;div key={post.id}&gt;
          &lt;h3&gt;{post.title}&lt;/h3&gt;
          &lt;p&gt;{post.body}&lt;/p&gt;
        &lt;/div&gt;
      ))}
    &lt;/div&gt;
  );
}
</code></pre>
<blockquote>
<p>You probably want to separate the API call into its own actions/xyz.ts file to keep things clean and this is an oversimplified example to demonstrate the concept, so it’s just got the bare minimum.</p>
</blockquote>
<p>As you can see, this is a lot simpler, and since the template is pre-compiled with the data from our fetch call, by the time it reaches the browser, the data is already present, so there’s no need to use <code>useEffect</code> or <code>state</code>.</p>
<p>By default, all components are <code>Server Components</code>. If you need to use React hooks like <code>useEffect</code> or <code>useState</code>, or any JavaScript events, you can force the component to behave just like a regular client component by adding <code>"use client"</code> at the top of the file.</p>
<h2 id="heading-zod-what">Zod - what?!</h2>
<p>This is not a Pokémon or Transformer 😂 (or Zordon from Power Rangers). TypeScript is a superset of JavaScript; types are erased at compile time and don’t exist at runtime, since the browser cannot understand TypeScript. So when it comes to forms, you can’t actually use an <code>interface</code> to ensure the data is valid.</p>
<p>You still need validation at runtime to provide meaningful error messages and also ensure the data is in the correct format before saving to a database, hence the mighty <code>zod</code> :</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> {z} <span class="hljs-keyword">from</span> <span class="hljs-string">'zod'</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> UserSchema = z.object({
    name: z.string().min(<span class="hljs-number">3</span>, {<span class="hljs-string">"error"</span>: <span class="hljs-string">"Please fill in a valid name."</span>}),
    email: z.email({ pattern: z.regexes.html5Email })
});

<span class="hljs-keyword">export</span> <span class="hljs-keyword">type</span> User = z.infer&lt;<span class="hljs-keyword">typeof</span> UserSchema&gt;;
</code></pre>
<p>So basically, <code>UserSchema</code> is just an object with some validation rules. Since a Zod schema is a runtime object rather than a TypeScript type, it can’t be used as a type throughout your code. For example, in my server action, I need to type the input form data so it’s usable in my code.</p>
<pre><code class="lang-typescript"><span class="hljs-string">'use server'</span>
<span class="hljs-keyword">import</span> { User, UserSchema } <span class="hljs-keyword">from</span> <span class="hljs-string">"@/lib/types"</span>

<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> saveUser = <span class="hljs-keyword">async</span> (formData: User) =&gt; {
   <span class="hljs-comment">// other fun stuff here</span>
}
</code></pre>
<p>To get an interface type out of Zod, simply do the following:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">type</span> User = z.infer&lt;<span class="hljs-keyword">typeof</span> UserSchema&gt;;
<span class="hljs-comment">// then you can use fields like user.name, user.email etc...</span>
</code></pre>
<p>Finally, you can use Zod to validate user input as follows in your server action.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> saveUser = <span class="hljs-keyword">async</span> (user: User) =&gt; {
    <span class="hljs-keyword">const</span> valid = UserSchema.safeParse(user);
    <span class="hljs-keyword">if</span> (!valid.success) {
        <span class="hljs-built_in">console</span>.log(valid.error);
        <span class="hljs-keyword">return</span>;
    }
}
</code></pre>
<h2 id="heading-dealing-with-forms-using-react-hook-form">Dealing with forms using React Hook Form</h2>
<p>Forms are one of the most common things in any app, but they’re also one of the most annoying things to deal with because you need to validate the form data, show meaningful error messages, preserve the form state, etc…</p>
<p>Luckily, there’s a powerful form library that just makes working with forms that much easier; it’s not part of Next.js by default, so you need to install it as follows:</p>
<pre><code class="lang-bash">pnpm install react-hook-form
</code></pre>
<p>Now, we can make a form to save a user (a very simplistic example). We use a client-side component in this instance because we want to show errors and prevent the form from clearing its fields. We do validation on the client first before even sending the form to the server, so it’s a much better user experience.</p>
<pre><code class="lang-typescript"><span class="hljs-string">'use client'</span>;

<span class="hljs-keyword">import</span> { saveUser } <span class="hljs-keyword">from</span> <span class="hljs-string">"@/actions/forms"</span>;
<span class="hljs-keyword">import</span> { UserSchema, User } <span class="hljs-keyword">from</span> <span class="hljs-string">"@/lib/types"</span>;
<span class="hljs-keyword">import</span> {zodResolver} <span class="hljs-keyword">from</span> <span class="hljs-string">'@hookform/resolvers/zod'</span>;
<span class="hljs-keyword">import</span> {useForm} <span class="hljs-keyword">from</span> <span class="hljs-string">'react-hook-form'</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">Home</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">const</span> form = useForm&lt;User&gt;({
    resolver: zodResolver(UserSchema)
  });

  <span class="hljs-keyword">const</span> onSubmit = <span class="hljs-keyword">async</span> (data: User) =&gt; {
    <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> saveUser(data);
    alert(result);
    <span class="hljs-comment">// result can be anything you want to return from the server side.</span>
  };

  <span class="hljs-keyword">return</span> (
    &lt;div&gt;
      &lt;form onSubmit={form.handleSubmit(onSubmit)}&gt;
        &lt;input {...form.register(<span class="hljs-string">'name'</span>)}
        <span class="hljs-keyword">type</span>=<span class="hljs-string">"text"</span>
        name=<span class="hljs-string">"name"</span> 
        className=<span class="hljs-string">"bg-white border-2 mb-2 mt-2 text-slate-800"</span>
         /&gt; &lt;br /&gt;

        &lt;input 
        {...form.register(<span class="hljs-string">'email'</span>)}
        <span class="hljs-keyword">type</span>=<span class="hljs-string">"email"</span> 
        name=<span class="hljs-string">"email"</span>  
        className=<span class="hljs-string">"bg-white border-2 mb-2 mt-2 text-slate-800"</span>
        /&gt;

      &lt;br /&gt;
          {form.formState.errors.email &amp;&amp; (
          &lt;p className=<span class="hljs-string">"text-red-500"</span>&gt;{form.formState.errors.email.message}&lt;/p&gt;
        )}
        &lt;br /&gt;

      &lt;input <span class="hljs-keyword">type</span>=<span class="hljs-string">"submit"</span> 
        value=<span class="hljs-string">"SEND"</span> 

        className=<span class="hljs-string">"bg-blue-500 hover:bg-blue-700 text-white font-bold py-2 px-4 rounded"</span>
        /&gt;
      &lt;/form&gt;
    &lt;/div&gt;
  );
}
</code></pre>
<p><strong>Breaking it down:</strong></p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">const</span> form = useForm&lt;User&gt;({
    resolver: zodResolver(UserSchema)
  });
</code></pre>
<p>We create a strongly typed form and link it to Zod via the resolver. This ensures each field is checked against the schema. You bind each form element by spreading <code>form.register('name')</code> onto it, so that when validation runs, the form library can access the element’s value and other properties.</p>
<pre><code class="lang-xml">       <span class="hljs-tag">&lt;<span class="hljs-name">input</span> {<span class="hljs-attr">...form.register</span>('<span class="hljs-attr">name</span>')}
        <span class="hljs-attr">type</span>=<span class="hljs-string">"text"</span>
        <span class="hljs-attr">name</span>=<span class="hljs-string">"name"</span> 
        <span class="hljs-attr">className</span>=<span class="hljs-string">"bg-white border-2 mb-2 mt-2 text-slate-800"</span>
         /&gt;</span>
</code></pre>
<p>Here’s where the magic happens:</p>
<pre><code class="lang-typescript"> onSubmit={form.handleSubmit(onSubmit)}
</code></pre>
<p>In this call, before the data is sent to the <code>Server Action</code> function that handles saving the user, our form object validates the data against the Zod schema first. If there are any errors, it will not trigger the onSubmit function and will instead just fill the errors into our form state, which we can then access as follows:</p>
<pre><code class="lang-typescript">      &lt;br /&gt;
          {form.formState.errors.email &amp;&amp; (
          &lt;p className=<span class="hljs-string">"text-red-500"</span>&gt;{form.formState.errors.email.message}&lt;/p&gt;
        )}
        &lt;br /&gt;
</code></pre>
<p><code>saveUser</code> ~ is what we call a <code>Server Action</code> i.e., a function that components (both client and server) can call to perform some task on the server side.</p>
<p>In my project, I created a folder called <code>actions</code> which basically contains all my server actions.</p>
<p><strong>Very Important:</strong> We need to ensure that the <code>use server</code> directive is always present in our server actions file, usually at the top of the file. This prevents the code from being bundled in the JavaScript code sent to the browser, which ensures a smaller JS file for the client to download. Furthermore, it prevents leaking any sensitive information to the browser, such as API keys and any other information you want to keep private.</p>
<pre><code class="lang-typescript"><span class="hljs-string">"use server"</span>

<span class="hljs-keyword">import</span> { User, UserSchema } <span class="hljs-keyword">from</span> <span class="hljs-string">"@/lib/types"</span>
<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> saveUser = <span class="hljs-keyword">async</span> (user: User) :<span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">string</span>&gt; =&gt; {
    <span class="hljs-keyword">const</span> valid = UserSchema.safeParse(user);
    <span class="hljs-keyword">if</span> (!valid.success) {
        <span class="hljs-keyword">return</span> JSON.stringify(valid.error.flatten());
    }

    <span class="hljs-comment">// do some db stuff</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">"Success"</span>;
}
</code></pre>
<p>If you came from the jQuery era like me, you’ll realize how much cleaner this is. We don’t need 101 <code>if</code> statements to validate each form element; we can just use proper typing and let Zod do the rest.</p>
]]></content:encoded></item><item><title><![CDATA[Hitchhiker's guide to AI for beginners]]></title><description><![CDATA[This will be more of an FAQ-style article, which is divergent from my regular content, but I wanted to write a guide for non-technical people or people unfamiliar with AI. You hear all these buzzwords]]></description><link>https://kevincoder.co.za/hitchhikers-guide-to-ai-for-beginners</link><guid isPermaLink="true">https://kevincoder.co.za/hitchhikers-guide-to-ai-for-beginners</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[AI]]></category><category><![CDATA[Beginner Developers]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Mon, 25 Aug 2025 16:11:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/6508a821cf1020be165986a9/df9af083-eb67-475a-9d66-f9c3868a1a84.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This will be more of an FAQ-style article, which is divergent from my regular content, but I wanted to write a guide for non-technical people or people unfamiliar with AI. You hear all these buzzwords, and they may be scary or intriguing. You want to learn more, but have no clue where to start.</p>
<p>You are worried about AI taking your job or what an AI-powered future may look like.</p>
<p>If you fit into one of the above categories, then this article is for you.</p>
<blockquote>
<p>PS: <a href="https://en.wikipedia.org/wiki/The_Hitchhiker%27s_Guide_to_the_Galaxy">The Hitchhiker's Guide to the Galaxy - Wikipedia</a> ~ Inspired by this insanely cool movie.</p>
</blockquote>
<h2>What is AGI?</h2>
<p>Let’s get this out of the way first. Nobody seems to have a clear idea of what AGI actually means, but here’s a general guideline.</p>
<p>AGI, at its core, is a goal post shift; previously, “AI” was thought of as being the all-intelligent machine that could think on the same level or even at a higher level than human beings.</p>
<p>AI companies started calling their “chatbots” (I’m using chatbots loosely for now) AI, thus AI lost the meaning it once had, and hence why they coined this new term “AGI”.</p>
<p>AGI, therefore, simply means the following:</p>
<ul>
<li><p>A machine that can understand information within the context of the real world.</p>
</li>
<li><p>A machine that can learn and adapt to perform just about any task a human can do.</p>
</li>
</ul>
<p>Wait! Can’t ChatGPT already do this?</p>
<p>No, ChatGPT can retrieve information, it can generate some pretty cool answers, and answer you in a way that you could mistake it for a human, but in reality, this is just a clever algorithm. ChatGPT itself doesn’t understand the meaning of the text it generates.</p>
<p><strong>Here’s an example:</strong></p>
<p><em>What I asked ChatGPT: “Draw me a clock with the hands showing the time as 12 minutes past 5 pm.”</em></p>
<p><strong>What it generated:</strong></p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755352821561/e8c3f97f-a6b1-48c7-8999-48ab33f802da.png" alt="" style="display:block;margin:0 auto" />

<p>The algorithm picked up that I want an image of a clock, but it got the time horribly wrong. Instead of responding with: “I am sorry, but I cannot draw those numbers on the clock,” or something like that, it still went ahead and drew the clock.</p>
<p>This is because it doesn’t fully understand the instruction; it’s looking for similar information in its vast knowledge base and just retrieving something that’s most similar. There is some fancy science going on to understand meaning to an extent, but still, it does not truly understand anything besides pattern matching.</p>
<p>Even after I corrected ChatGPT: <em>“Uhm, but the time was wrong, please show the time 5:12 pm”,</em> I still get:</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755353128149/4841af4f-5cfa-4471-890b-2d4c42fbaba1.png" alt="" style="display:block;margin:0 auto" />

<p>AGI would therefore be aware of the question holistically and be able to draw the correct image; and if it can’t, it would recognise that it doesn’t have that ability, and try to learn that skill or just respond with: “I don’t know how to do that yet…”.</p>
<h2><strong>What is an LLM?</strong></h2>
<p>LLM stands for “Large Language Model”. This is basically what most refer to as “AI” or “AI Chatbots”. These are powerful programs that are trained on large pieces of information from around the web; they have a “brain” filled with billions of documents and pieces of information.</p>
<p>When you ask a Chatbot a question, it’s basically doing something similar to “Google Search”. The Chatbot breaks down your question into something called “tokens”, which are mathematical representations of words.</p>
<p>These tokens now give the Chatbot a superpower that goes beyond a simple keyword search, in that it can use math to figure out the relationship of words based on how these tokens are arranged, how far apart they are from each other, and so on.</p>
<p>So essentially, it takes your question and does a similarity search to find the most relevant tokens in its knowledge base, then uses those tokens together with some clever algorithmic magic to figure out the most likely answer that would resolve your query.</p>
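<p>To make this “similarity search” idea concrete, here’s a toy sketch in Python. Real LLMs compare learned numeric vectors rather than raw words, so treat this purely as an illustration of the “find the most similar text” idea:</p>
<pre><code class="lang-python"># Toy illustration only: real models compare numeric token vectors,
# not raw words, but the retrieval idea is the same.

def tokenize(text):
    # Crude stand-in for a real tokenizer: lowercase words, no punctuation.
    return set(text.lower().replace(".", " ").replace("?", " ").split())

def similarity(a, b):
    # Fraction of shared tokens (Jaccard overlap).
    overlap = len(a.intersection(b))
    total = len(a.union(b))
    return overlap / total if total else 0.0

knowledge_base = [
    "The capital of France is Paris.",
    "Go is a compiled programming language.",
    "Python is an interpreted language.",
]

question = "Which language is interpreted?"
q_tokens = tokenize(question)

# Retrieve the snippet most similar to the question.
best = max(knowledge_base, key=lambda doc: similarity(q_tokens, tokenize(doc)))
print(best)  # Python is an interpreted language.
</code></pre>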
<p>Now, complex LLMs like ChatGPT do a lot more magic in the background than just mathematical formulas, but at the end of the day, it’s still just a program that applies an algorithm.</p>
<h2>What is a model?</h2>
<p>Now that we have an understanding of how LLMs work, this is basically a continuation of the last question: “<strong>What is an LLM?”</strong></p>
<p>AI companies like OpenAI spend months collecting data from various sources, and they clean that data by using various data science techniques to ensure it’s as accurate as possible.</p>
<p>The goal of this process is to build a knowledge base on which they can train the AI. You will notice every 6 months or so, there’s a new version of ChatGPT: “o1”, “4o”, “GPT-3”, “GPT3.5”, “GPT-5”, etc…</p>
<p>Training or teaching AI is an expensive venture; it takes time because we are dealing with billions of data points. Thus, a model is a program, a “black box” filled with layers and layers of information (known as neural networks).</p>
<p>The model provider will feed this program with the information over a period of time. The program will then store that information in its neural networks (a bit like memorizing information for an exam) and apply various mathematical calculations to make sense of the data. This process can take months; once it’s complete, the program is then frozen in time, its knowledge and understanding frozen on that dataset, thus the “GPT-5”, “o1”, etc, is just a label of the box based on which dataset it was trained on.</p>
<p>Furthermore, to clarify: if you trained the program when Obama was the president of the US, then asked the model in 2022, “Who is the current president?”, it’s going to say “Obama” or “My knowledge cut off is up until 2016, therefore I cannot answer that question”.</p>
<p>At every new version, the AI companies may discover new techniques and improve upon the dataset, which ultimately leads to a smarter model. This is why they name them at the end of each training cycle.</p>
<p>It may be possible that “GPT-5” is the smartest model at this point, but for a particular task, “GPT4o” might still perform better; this is why they don’t just overwrite the old model. It actually happened recently, where “GPT-5” answered questions more robotically; even though it was smarter, it seemed less human, thus people still preferred “GPT4o”.</p>
<p>While the model itself may not have memory beyond its cutoff point, most modern chatbot systems like ChatGPT or Claude do have access to the internet and other data sources like <strong>RAG</strong>, which allows them to pull up-to-date information. But if you disconnected the model from these sources, it would respond with only the frozen knowledge from its training dataset.</p>
<h2>What is hallucination?</h2>
<p>I am sure you have experienced this before; if not, you will at some point when interacting with a chatbot. You ask the bot a question, it gives you a long, detailed answer full of confidence, but when you fact-check the answer, it’s either completely or partially incorrect.</p>
<p>Hallucination is basically when the algorithm predicts the answer incorrectly, which is bound to happen with current models, regardless of how smart they are.</p>
<h2><strong>What is RAG?</strong></h2>
<p>I covered this extensively throughout my blog with coding examples as well, so I won’t dive too deeply into this subject matter, but will just give you a general idea of what this means. If you are interested in a more technical understanding, please check out one of my earlier articles: <a href="https://kevincoder.co.za/how-to-build-a-pdf-chatbot-with-langchain-and-faiss">How to build a PDF chatbot with Langchain 🦜🔗 and FAISS</a></p>
<p>RAG stands for “Retrieval Augmented Generation,” which is a fancy way of saying that you can supplement the model’s memory with your own dataset.</p>
<p>For example, if I had to ask ChatGPT, “What is the price of the latest iPhone?” It gives me pricing in dollars and information from foreign sources. I am South African, and I’m more interested in the “Rand” price and local stores.</p>
<p>To fix this, I can upload a PDF via the ChatGPT interface detailing the information from local retailers, and then re-ask it the same question, and tell it to find the answer from that PDF. This time round, it’s going to give me the correct answer based on the information in that PDF.</p>
<p>This is incredibly useful to websites that use Chatbots, because if I have an e-commerce store, I can build a RAG system that feeds product data from my store to the Chatbot; thus, even though I’m still using ChatGPT via the API, I can scope its response to only come from my RAG information and suggest only products from my store.</p>
<p>Can you imagine being on takealot.com and then getting product recommendations from amazon.co.za? That would be a disaster, giving a competitor free advertising!</p>
<p>Therefore, RAG allows us some control of the Chatbot’s knowledge retrieval.</p>
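<p>Here’s a minimal sketch of the RAG flow in Python. Everything in it is made up for illustration: the catalog, the prices, and <code>ask_llm</code>, which is a stand-in for a real model API call so the example runs without any external service:</p>
<pre><code class="lang-python"># Minimal RAG sketch: retrieve our own data first, then hand it to the
# model as context. All store names and prices below are hypothetical.

store_catalog = [
    "iPhone 15, R18,999 at MyLocalStore",
    "Galaxy S24, R17,499 at MyLocalStore",
]

def retrieve(question, documents):
    # Real systems use embeddings and a vector store; simple word
    # overlap stands in for that here.
    q = set(question.lower().replace("?", "").split())
    return max(documents, key=lambda d: len(q.intersection(set(d.lower().split()))))

def ask_llm(prompt):
    # Stand-in for an actual LLM API call.
    return "LLM answers using: " + prompt

def answer(question):
    # Prepend the retrieved context so the model answers from OUR
    # data instead of its frozen training set.
    context = retrieve(question, store_catalog)
    prompt = "Answer only from this context: " + context + " Question: " + question
    return ask_llm(prompt)

print(answer("What is the price of the latest iPhone?"))
</code></pre>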
<h2>Inference and other common terminology</h2>
<p>Some common terminology you’ll often hear:</p>
<ul>
<li><p><strong>Tokens</strong>: When models ingest your prompt, they don’t read it as a sentence or word; instead, every word gets translated into a mathematical representation, and these representations are known as tokens. 1 token roughly equates to 4 characters, but this may vary depending on the word or phrase.</p>
</li>
<li><p><strong>Epoch</strong>: An epoch is one complete pass through your entire dataset. You would normally train the model over more than one epoch, because each additional time it sees the data, it can apply optimizations and improve its learning. Of course, too many epochs can hurt accuracy (overfitting), so this requires some balancing effort.</p>
</li>
<li><p><strong>Inference</strong>: This is when you actually ask a trained model a question. Each time you “prompt” the model, it needs to convert your request into tokens and do a whole bunch of math to determine the final response. This process requires GPU resources as well, but it is not as computationally expensive as training.</p>
</li>
<li><p><strong>Model weights</strong>: After training a model, it generates something called “weights,” which are basically mathematical <strong>patterns</strong> derived from your training data. Since models are essentially predicting the next best word, phrase, or sentence, they use these weights to influence which sentences or words to return based on your input prompt.</p>
</li>
<li><p><strong>Neural network</strong>: Inspired by the human brain, which is made up of millions of neurons. Let's call them "micro-computers" that link together to form a chain, a network; together these "micro-computers" can digest, analyze, and parse information in an intelligent way, leading to some outcome like making you smile, or cry, or run, etc. In the same way, models have neural networks made up of many layers, and each layer processes and transforms the information before passing it to the next layer. Collectively, they work together to build complex patterns and ultimately generate tokens that we get back as text, images, or even video.</p>
</li>
<li><p><strong>Model parameters</strong>: You’ll often see “7B”, “12B”, “24B”, “70B”, etc., next to model names. These refer to the number of parameters a model contains (in billions, so 7 billion, 12 billion, etc.). Parameters are simply the number of weights and neural connections the model contains in its neural networks. Usually, more parameters means a smarter model, but this is not always the case, because various factors can influence intelligence, including the training process, the quality of the dataset, and so forth.</p>
</li>
</ul>
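<p>The “epoch” idea above can be sketched as a stripped-down training loop (no real model or weights here, just the shape of the process):</p>
<pre><code class="lang-python"># Each epoch is one full pass over the dataset. A real trainer would
# compute a loss per sample and nudge the model's weights; here we
# just count the passes to show the structure.

dataset = ["example 1", "example 2", "example 3"]
epochs = 3

passes_over_data = 0
for epoch in range(epochs):
    for sample in dataset:
        pass  # a real training step would go here
    passes_over_data += 1

print(passes_over_data)  # 3: one pass per epoch
</code></pre>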
<h2>What is fine-tuning?</h2>
<p>Like OpenAI or Anthropic, you too can train your own model on your own dataset. Usually, it would be crazy expensive to start from scratch, so it’s much easier and more cost-effective to just take an existing model and train it further using your own data.</p>
<p>This process will add or replace knowledge in the existing model to make it behave in a certain way. Currently, if you talk to Claude or ChatGPT, it’s conversational, but you may want a model that just follows a particular sequence.</p>
<p>This can be useful for restaurant ordering, booking car repair appointments, or just about any kind of niche task. You can give the model a few hundred or thousand examples teaching it how you want it to respond, and then the model will follow your examples rather than giving generic answers.</p>
<p>The benefits of fine-tuning:</p>
<ul>
<li><p>Faster inference. With RAG, you need to query a vectorstore, then pass that information to the model. This could add some speed overhead to responses. With Fine-tuned models, you don’t necessarily need to fetch extra data since everything is already baked into the model (you can still use RAG with fine-tuned models if needed).</p>
</li>
<li><p>Cheaper inference. Not always the case, but for niche tasks, you could use much smaller models like “tinyllama” which can run on smaller GPUs.</p>
</li>
<li><p>Better performance. While RAG works well, it doesn’t always lead to accurate results. With fine-tuning, you could get much more accurate responses depending on your data and use case.</p>
</li>
</ul>
<h2>Will AI take my job?</h2>
<p><a href="https://kevincoder.co.za/how-to-build-a-pdf-chatbot-with-langchain-and-faiss">While AI can automate some tasks</a> and help humans speed up their workflow, AI is not a drop-in replacement for human beings, purely because of the distinction made in my first point of “AGI”.</p>
<p>I am not saying there won’t be job losses; of course, with every new platform shift, there will always be a percentage of jobs no longer needed. In most of these cases, it’ll just require upskilling and pivoting of the individuals, rather than mass unemployment.</p>
<p>For example, in the past, you would need a frontend developer to design landing pages. With tools like Claude code, you could potentially spin up a world-class landing page in minutes without a frontend developer.</p>
<p>Does this mean the end of all frontend developers? No! Not all websites are simple landing pages; some need multi-stage forms, integration with legacy systems, and complex workflows. AI is simply not smart enough to handle all of these.</p>
<p>Thus, these developers would probably have more time to actually sit with clients or expand their skills into other domains like backend development, or even end up making way more money! Because they can automate more of their job, thus allowing them to push out more projects in a shorter space of time.</p>
]]></content:encoded></item><item><title><![CDATA[Engineering 101: Understanding bits and bytes]]></title><description><![CDATA[In my early years, when my age was still single digits, I was tearing apart electrical things like there’s no tomorrow. I was trying to understand how they tick. How do they work? Where do sounds and pictures come from on TV? What happens when this V...]]></description><link>https://kevincoder.co.za/engineering-101-understanding-bits-and-bytes</link><guid isPermaLink="true">https://kevincoder.co.za/engineering-101-understanding-bits-and-bytes</guid><category><![CDATA[Computer Science]]></category><category><![CDATA[data structures]]></category><category><![CDATA[binary]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Sun, 18 May 2025 15:42:14 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747582800306/51fb2880-1ed3-431f-b2d2-cca8f922663a.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In my early years, when my age was still single digits, I was tearing apart electrical things like there’s no tomorrow. I was trying to understand how they tick. How do they work? Where do sounds and pictures come from on TV? What happens when this VCR thing (the device that played movies before CDs and DVDs) reads the tape?</p>
<p>Not fun for my parents, as you can imagine 🙈</p>
<p>Nonetheless, this inquisitiveness has helped me a ton throughout my career. I love building stuff, but I love tearing things apart even more! I am never content with “It just works.” I always ask why.</p>
<p>As a software engineer, this is a vital skill to have, especially now in the modern era of AI, where code is just being generated willy-nilly. Understanding computer fundamentals will help you immensely in the long run to sustain a long and fruitful career in tech.</p>
<p>The goal of this article is to make you stop and think for a second about your code and what is actually happening when you hit that “build,” “compile,” or “run” button.</p>
<blockquote>
<p>ℹ️ I will use a mixture of languages to demonstrate some concepts (usually Python and Golang). You do not need to understand these languages just yet, since the goal is to teach you how to think like an engineer, such that you can apply this knowledge to just about any language.</p>
</blockquote>
<h2 id="heading-programming-ihttpswwww3schoolscompythondefaultasps-just-1s-and-0s-essentially-electrical-signals">Programming is just 1s and 0s, essentially electrical signals</h2>
<p>Computer hardware can only understand one language: either turn on the flow of electricity in a pathway or turn it off.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746425126749/e6b96094-0c28-4653-8f47-857115e1f403.jpeg" alt class="image--center mx-auto" /></p>
<p>Thus, regardless of the programming language you use, everything gets compiled down to machine code. Which is a set of instructions your operating system eventually passes down to the hardware in your computer as electrical signals.</p>
<p>Now, obviously, you cannot send raw 1s and 0s to the hardware; this is why we have operating systems and something known as machine code.</p>
<p>Operating systems are responsible for actually taking the machine code and converting it into electrical signals. Not all devices have an operating system; some just have smaller programs, often referred to as embedded systems. Nonetheless, they perform a similar role to that of a big OS like Windows, Linux, or Mac OS.</p>
<p>Machine code is the lowest level of software abstraction, and the last mile before code becomes electricity inside transistors. Essentially, machine code is a chain of ones and zeros, arranged together in a particular sequence.</p>
<p>Think of a computer monitor. It’s made up of millions of pixels of colors, i.e., dots. When we need to print the words “Hello World” to the screen, we send the OS machine code that may look like this: “1111 1101 1011 1111 1101 1111 1011” (Of course, I just made up this sequence for demonstration purposes).</p>
<p>Using this sequence, the system determines which pixels need to light up to form the shapes of "H-e-l-l-o W-o-r-l-d" according to the selected font, size, and color.</p>
<p>When you write a program, your code looks similar to written languages like English, albeit with a bit of math and symbols, but essentially it’s human-readable. This is by design, because we humans cannot manage thousands of 1s and 0s; it’ll be almost impossible to read and write accurately.</p>
<p>This is why we invented programming languages; they offer a high level of abstraction, allowing programmers to write code in a readable language, but also write code in a structured way so that it can be later converted into electrical signals.</p>
<h2 id="heading-translator-programs">Translator programs</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747475999201/8b4b249d-d5be-47b0-bd4c-6a72ef38aabb.jpeg" alt class="image--center mx-auto" /></p>
<p>Since programming languages aren’t written in machine code, the Operating System itself cannot understand your code. Instead, we need a translator program, something that can read your code and convert it to machine code.</p>
<p>Whenever you start working with a new programming language, you’ll often need to download and install some kind of runtime program like the Python interpreter or Node, or the JVM, and so forth.</p>
<p>While most of these runtimes come with various libraries and tools, at the heart of each runtime is a translator program of some sort.</p>
<p>There are many variations of translator programs, but generally, these are the most common:</p>
<ul>
<li><p>Compilers. In Golang, for example, the compiler runs once and generates what is known as an “executable”. This is your entire program in a binary format. To distribute your program, you no longer need the source code (the code you type in your coding editor) or the runtime to run the executable program, since it’s now machine code and can be executed directly by the Operating System.</p>
</li>
<li><p>Interpreters. In the case of languages like Python, you do not build a single executable. Instead, you need to distribute your source code files, and in order for the target machine to run them, it needs to have the Python interpreter installed. While the interpreter may convert the program to an intermediary format like “bytecode” (we’ll talk about this soon), it’s not machine code and still needs processing to become machine code. So basically, interpreters read and convert your program’s code to machine code at runtime, hence why Python is much slower than Golang in terms of execution speed.</p>
</li>
<li><p>JIT compilers. A best of both worlds, Node.js is a good example. When you distribute your program, it’s distributed as source code. When the program is first run, the JIT compiler will analyze your code and convert frequently used portions into binary code similar to a compiler. It further monitors and optimizes the code as the program is running.</p>
</li>
</ul>
<p>So, in summary, translator programs like the Golang compiler are responsible for taking code you write in your code editor and converting that to machine code, such that your operating system can then execute and eventually convert those to electrical signals.</p>
<blockquote>
<p>💡 “Parsing” is a step in the translation process of your code into an executable program. The translator program reads your code and analyzes it against the syntax rules of the programming language being used. During this process, the parser will also throw errors if any exist in your code, which can crash your program or cause the compiler to not generate an executable binary, depending on the type of translator program.</p>
</blockquote>
<h3 id="heading-so-what-is-byte-code">So, what is “Byte Code”</h3>
<p>In the past, interpreters were simple; they read your source code at runtime, line by line, and then converted it to binary code, which then got executed. This all happened in real-time. The problem with this approach is that it’s slow, too slow for modern times, where applications are doing much more work and have many more users running the program concurrently, especially on the web.</p>
<p>Modern interpreters, like the Python Interpreter, have a caching mechanism. Instead of constantly re-reading the source code and going through all the startup steps like parsing over and over again, the Python interpreter will transform your raw source code into “.pyc” files.</p>
<p>This is not machine code; it’s basically just an optimized version of your source that’s already gone through the initial startup steps, like parsing. It’s not yet readable by the operating system. Bytecode is designed to be readable by the Python interpreter. The interpreter still has to execute this bytecode at runtime, translating each bytecode into an instruction for the computer to execute.</p>
<p>When you “publish” your application code, that version of code won’t change until you publish an update, so the parsing (and other code processing stages) only needs to happen once, and thereafter the interpreter knows that your code is valid and has passed all the checks, thus it doesn’t need to keep redoing all these steps.</p>
<p>While interpreters are slower than compilers, they do offer the benefit of hot-swapping code. In a compiled program, you have to recompile the entire program whenever you make a change and then redeploy that binary.</p>
<p>With interpreters like the Python interpreter, you can simply change the file(s) that have been updated, and the interpreter will automatically recreate “.pyc” files only for what has changed. This makes it easier to debug code or deploy portions of your code while the program is still running (you may still need to restart the application for all parts of your program to see the update).</p>
<p>So in summary, a “translator” program (like compilers or interpreters) takes source code (structured human-readable instructions for the computer) that you type in a code editor like VSCode and converts that to machine code. Machine code is <strong>binary instructions</strong> (1s and 0s) that the computer's processor can understand and execute.</p>
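<p>You can actually peek at Python’s bytecode yourself using the standard-library <code>dis</code> module:</p>
<pre><code class="lang-python">import dis

def greet():
    return "Hello world!"

# Print the bytecode instructions the Python virtual machine will
# execute for greet(); each line is one instruction the interpreter
# still has to translate into real work at runtime.
dis.dis(greet)
</code></pre>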
<h2 id="heading-what-are-bytes">What are Bytes?</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747476562820/3f10383a-d1dc-425d-80e5-620ab1e60cfb.jpeg" alt class="image--center mx-auto" /></p>
<p><strong>Each bit (binary digit)</strong> does one very simple job: either <strong>on (1)</strong> or <strong>off (0)</strong>, letting current through or not. Alone, this doesn’t amount to much, but together in a large program with millions of 1s and 0s arranged in a certain way, that program can achieve quite complex results like playing a movie or sending an email, and so on.</p>
<p>Now, imagine you own an electronics company, and your company <strong>sells TVs</strong>:</p>
<ul>
<li><p>The <strong>customer buys a finished TV</strong>, not the list of internal components or the names of engineers who assembled it.</p>
</li>
<li><p>They don’t care which team achieved what or was responsible for which component.</p>
</li>
<li><p>What they see is the final product: <strong>the TV</strong>. They don’t even see the components inside the TV; they just see the TV as a whole.</p>
</li>
</ul>
<blockquote>
<p>💡 <strong>Abstraction</strong> is about hiding complexity and exposing only what’s necessary.</p>
</blockquote>
<p>The purpose of the above example is to help you understand what “Abstraction” is, because from here on out, as we move away from simple 1s and 0s, we get into the software realm, where programmers use varying levels of abstraction to ultimately achieve some result.</p>

<p>At the highest level, you write a program in a language of your choice, let’s say Python. This program consists of instructions written in human-readable code. Example:</p>
<pre><code class="lang-python">print(<span class="hljs-string">"Hello world!"</span>)
</code></pre>
<p>Eventually, the translator program will convert this into machine code (just made up for illustration purposes):</p>
<pre><code class="lang-python"><span class="hljs-number">10110000</span> <span class="hljs-number">01100001</span>
<span class="hljs-number">10110001</span> <span class="hljs-number">01100010</span>
</code></pre>
<p>When looking at the machine code closely, you’ll notice that each block of numbers is exactly 8 digits long. This is what a “Byte” is: it’s a grouping of binary numbers in sequences of eight. Each Byte is, therefore, a “box” of sorts; it stores digital information by arranging 1s and 0s in different patterns (maximum 8 bits).</p>
<p>In the case of <code>print(“Hello world!”)</code>, each letter or symbol is represented by a byte, which could look like:</p>
<pre><code class="lang-python">p   -&gt; <span class="hljs-number">112</span> -&gt; <span class="hljs-number">01110000</span>
r   -&gt; <span class="hljs-number">114</span> -&gt; <span class="hljs-number">01110010</span>
i   -&gt; <span class="hljs-number">105</span> -&gt; <span class="hljs-number">01101001</span>
n   -&gt; <span class="hljs-number">110</span> -&gt; <span class="hljs-number">01101110</span>
t   -&gt; <span class="hljs-number">116</span> -&gt; <span class="hljs-number">01110100</span>
(   -&gt; <span class="hljs-number">40</span>  -&gt; <span class="hljs-number">00101000</span>
<span class="hljs-string">"   -&gt; 34  -&gt; 00100010
H   -&gt; 72  -&gt; 01001000
e   -&gt; 101 -&gt; 01100101
l   -&gt; 108 -&gt; 01101100
l   -&gt; 108 -&gt; 01101100
o   -&gt; 111 -&gt; 01101111
    -&gt; 32  -&gt; 00100000
w   -&gt; 119 -&gt; 01110111
o   -&gt; 111 -&gt; 01101111
r   -&gt; 114 -&gt; 01110010
l   -&gt; 108 -&gt; 01101100
d   -&gt; 100 -&gt; 01100100
!   -&gt; 33  -&gt; 00100001
"</span>   -&gt; <span class="hljs-number">34</span>  -&gt; <span class="hljs-number">00100010</span>
)   -&gt; <span class="hljs-number">41</span>  -&gt; <span class="hljs-number">00101001</span>
</code></pre>
<p>As you can see by arranging the 1s and 0s in different sequences, we can represent letters, numbers, and symbols.</p>
<p>If you look closely, we also have numbers 112, 114, 105, and so on. This is another significant thing to understand about bytes. A byte can only store numbers; you cannot put “p” into a Byte because computers only understand numbers. Thus, we assign a number to each character (this is known as character encoding, more on that later).</p>
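<p>You can verify these character-to-number mappings yourself with Python’s built-in <code>ord()</code> and <code>bin()</code> functions:</p>

```python
# ord() returns the number assigned to a character,
# and bin() shows that number's binary bit pattern.
print(ord("p"))                        # 112
print(bin(ord("p")))                   # 0b1110000
print(list("print".encode("ascii")))   # [112, 114, 105, 110, 116]
```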
<p>When you arrange the 1s and 0s together in a group of eight, these patterns have a maximum of 256 different combinations. Thus, a Byte can hold a number from 0 to 255.</p>
<p>In the real world, we use “Grams” to weigh things, but the numbers often become too large and cumbersome to work with. It’s far easier to write and read 10 kg than 10,000 grams.</p>
<p>Similarly, in the computing world, most programs will easily exceed 1000 Bytes. So a number like 1,073,152 Bytes can become really cumbersome to comprehend as a human; instead, we also have bigger units of measure like Kilobytes, Megabytes, Gigabytes, and so on.</p>
<p>These measures work as follows:</p>
<ul>
<li><p>1KB = 1024 Bytes</p>
</li>
<li><p>1MB = 1024 Kilobytes</p>
</li>
<li><p>1GB = 1024 MB</p>
</li>
</ul>
<p>Note: some hardware manufacturers (and their marketing material) use decimal units instead, i.e., 1 KB = 1000 Bytes and 1 MB = 1000 KB, rather than 1024.</p>
<p>In summary, a “Byte” is a virtual unit of measure that is standardized across all computing systems. It groups together 8 bits (8 1s and 0s), and can be thought of as a “box” that can hold a number from 0 to 255.</p>
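<p>A quick sanity check of the conversion, using the 1,073,152-Byte example from above:</p>

```python
size_bytes = 1_073_152

# Divide by 1024 for each step up the unit ladder:
print(size_bytes / 1024)          # 1048.0 Kilobytes
print(size_bytes / 1024 / 1024)   # roughly 1.02 Megabytes
```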
<h2 id="heading-what-is-character-encoding">What is character encoding?</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747578981653/3d44b6e2-4051-4c55-b21b-d909cbce5541.jpeg" alt class="image--center mx-auto" /></p>
<p>On a mobile device, using icons and GIFs is common to convey certain meanings. We do this because, well, we’re chilling and too lazy to type 🙃. Why type out a whole phrase, “I am so happy right now!”, when you can just send a smiley face emoji?</p>
<p>Most people worldwide, regardless of what language they speak, can understand what an emoji means because at some point, we all agreed that <strong>😃 means happy and</strong> ☹️means sad. For each of these emojis, we associated an emotion and globally accepted this meaning.</p>
<p>In computers, we know that there are just 2 digits that represent everything, either 1 or 0. When we group these two digits together, we can attach a more complex meaning to a Byte of data.</p>
<p>Still, a Byte can only represent numbers, with a maximum value of 255. So, how on earth do we represent letters, let alone emojis?</p>
<p>To solve this problem, computer scientists developed character encodings. Character encodings basically associate a letter or character with a particular byte or bytes. Thus, character encodings are simply a standard of meaning for converting characters into bytes.</p>
<p>One of the earliest character encodings that worked well with the English language was known as “ASCII”. Similar to the emoji analogy, in ASCII, each character was associated with a particular number from 0 to 127, allowing the computer to capture the meaning of each character in a numerical format.</p>
<blockquote>
<p>💡 This version of ASCII is known as “Standard ASCII”. Later, extended versions of ASCII also use values above 127.</p>
</blockquote>
<p>Remembering our discussion on Bytes, you’ll recall that a Byte can store any number from 0 to 255. So, essentially, each ASCII encoding can be represented by a single Byte. Furthermore, each ASCII character only needs 7 bits out of the potential 8 bits in a byte (so at least one bit will always be off ‘0’).</p>
<blockquote>
<p>💡 Notice how everything starts at 0. This is vital for later on in your career, because this pattern persists throughout programming, especially in data structures like arrays, which in most languages start at zero.</p>
</blockquote>
<p>While still used today in many systems, ASCII became very limited later when we needed to store more complex characters, such as emojis and characters from other languages, like Japanese. To solve this problem, a more robust encoding system, known as Unicode, was introduced.</p>
<p>Unicode is a tad more complex. Like ASCII, it allows us to represent characters in a numerical form known as “code points”; however, Unicode itself is just a standard (like a piece of paper with a list of rules to follow). You still need a system to actually store those code points inside Bytes.</p>
<p>These systems are known as encoding schemas, and there are a few varying <strong>implementations</strong>, like <strong>UTF-8</strong>, <strong>UTF-16</strong>, or <strong>UTF-32</strong>.</p>
<p>UTF-8 is the most popular; it basically uses up to 4 bytes to store characters. A simple English character would use 1 byte; however, some characters from a more complex language like Japanese can use 2-4 Bytes.</p>
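<p>You can see UTF-8’s variable-width behaviour directly by encoding characters and counting the resulting bytes:</p>

```python
# UTF-8 uses a variable number of bytes per character:
print(len("A".encode("utf-8")))    # 1 byte  (basic English letter)
print(len("é".encode("utf-8")))    # 2 bytes (accented Latin)
print(len("日".encode("utf-8")))   # 3 bytes (Japanese)
print(len("😃".encode("utf-8")))   # 4 bytes (emoji)
```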
<p>Encodings don’t only apply to text or characters; you also get encoding systems like the RGB system, which basically represents a colour using the 0-255 range. RGB uses 3 bytes to do this; each byte provides a value from 0 to 255 for the red, green, and blue color channels, respectively. By combining these three primary colors at different intensities, RGB encoding can represent approximately millions of different colours.</p>
<p>Some RGB examples:</p>
<pre><code class="lang-python">rgb(<span class="hljs-number">255</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>) ~ Red
rgb(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">255</span>) ~ Blue
rgb(<span class="hljs-number">0</span>, <span class="hljs-number">255</span>, <span class="hljs-number">0</span>) ~ Green
rgb(<span class="hljs-number">238</span>, <span class="hljs-number">130</span>, <span class="hljs-number">238</span>) ~ Pink
</code></pre>
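<p>Since each channel is exactly one byte, Python’s <code>bytes</code> type can hold a colour directly, and the “millions of colours” figure falls straight out of the math:</p>

```python
# Each colour channel fits in one byte (0-255):
red = bytes([255, 0, 0])   # R, G, B
print(list(red))           # [255, 0, 0]

# Three bytes give 256 x 256 x 256 combinations:
print(256 ** 3)            # 16777216 (~16.7 million colours)
```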
<h2 id="heading-the-math-of-binary">The Math of Binary</h2>
<p>In regular math, we are accustomed to the base 10 system, 1-10-100. Since we have a maximum of 10 digits(0-9), every time we reach the number 9, we need to “reset,” which means we add a 0 to the right, and the number on the left increases by 1.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747635152539/47eb7d1b-da79-434d-bea5-6510e9b908c5.jpeg" alt class="image--center mx-auto" /></p>
<pre><code class="lang-python"><span class="hljs-number">9</span>
<span class="hljs-number">10</span>
<span class="hljs-number">11</span>
<span class="hljs-number">12</span>
<span class="hljs-number">13</span>
<span class="hljs-number">14</span>
<span class="hljs-number">15</span>
<span class="hljs-number">16</span>
<span class="hljs-number">17</span>
<span class="hljs-number">18</span>
<span class="hljs-number">19</span>
<span class="hljs-number">20</span>
<span class="hljs-number">21</span>
</code></pre>
<p>Now, when we get to 9, we have run out of numbers. We can’t add 1 to the left because “9” is the highest digit in the numbering system. When this happens, we simply replace each “9” with a “0” and then add 1 to the far left, i.e., 10; the same applies to 99, 999, and so on.</p>
<pre><code class="lang-python"><span class="hljs-number">9</span> + <span class="hljs-number">1</span> = <span class="hljs-number">10</span>       = <span class="hljs-number">10</span>¹
<span class="hljs-number">99</span> + <span class="hljs-number">1</span> = <span class="hljs-number">100</span>     = <span class="hljs-number">10</span>²
<span class="hljs-number">999</span> + <span class="hljs-number">1</span> = <span class="hljs-number">1000</span>   = <span class="hljs-number">10</span>³
<span class="hljs-number">9999</span> + <span class="hljs-number">1</span> = <span class="hljs-number">10000</span> = <span class="hljs-number">10</span>⁴
</code></pre>
<p>Essentially, we are always working from right to left; we check the rightmost number to see if it’s 9, and if so, we apply the rules as per above. This is known as positional notation.</p>
<p>In <strong>base-10</strong>, when you reach 9, you reset and carry to the left. Binary works the same way, except the highest digit is 1, so you reset much sooner: 0, 1, 10, 11, 100, and so on, growing the width of bits as you go.</p>
<p>To convert a binary number into a base-10 number, we use each bit’s position, counted from the right starting at 0 and incrementing by 1 as you move left.</p>
<p>Once you have a bit’s position, raise 2 to that power to calculate its corresponding base-10 value.</p>
<p>In the case of 1 Byte, the rightmost bit is 2^0 (remember we always start at zero), the next bit to its left is 2^1, then 2², and so on: the further left you go, the higher the power. For a Byte, the maximum power we can raise to is 7, which is basically: <code>2×2×2×2×2×2×2=128</code></p>
<pre><code class="lang-python"><span class="hljs-number">2</span>^<span class="hljs-number">0</span> × <span class="hljs-number">1</span> = <span class="hljs-number">1</span>
<span class="hljs-number">2</span>^<span class="hljs-number">1</span> × <span class="hljs-number">1</span> = <span class="hljs-number">2</span>
<span class="hljs-number">2</span>^<span class="hljs-number">2</span> × <span class="hljs-number">1</span> = <span class="hljs-number">4</span>
<span class="hljs-number">2</span>^<span class="hljs-number">3</span> × <span class="hljs-number">1</span> = <span class="hljs-number">8</span>
<span class="hljs-number">2</span>^<span class="hljs-number">4</span> × <span class="hljs-number">1</span> = <span class="hljs-number">16</span>
<span class="hljs-number">2</span>^<span class="hljs-number">5</span> × <span class="hljs-number">1</span> = <span class="hljs-number">32</span>
<span class="hljs-number">2</span>^<span class="hljs-number">6</span> × <span class="hljs-number">1</span> = <span class="hljs-number">64</span>
<span class="hljs-number">2</span>^<span class="hljs-number">7</span> × <span class="hljs-number">1</span> = <span class="hljs-number">128</span>
--------------------
Total     = <span class="hljs-number">255</span>
</code></pre>
<p>The above is a calculation based on a single Byte, where all values are “1” <code>11111111</code> This is the maximum value a Byte can hold, which, as you can see, adds up to 255.</p>
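<p>The positional calculation above can be written as a small function. Python’s built-in <code>int(s, 2)</code> does the same thing, which makes a handy cross-check:</p>

```python
def binary_to_decimal(bits: str) -> int:
    total = 0
    # Walk right to left; position 0 is the rightmost bit.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total

print(binary_to_decimal("11111111"))  # 255, the maximum value of one Byte
print(int("11111111", 2))             # 255, the built-in equivalent
```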
<h2 id="heading-what-is-a-variable">What is a variable?</h2>
<p>Since compilers and interpreters take care of converting code into bytes, we don’t often need to worry about how the code is translated and mapped on a character level. Instead, programming languages introduce their own “box” to put stuff in, called a “variable”.</p>
<p>A variable is essentially x number of Bytes that are grouped together to hold a certain portion of information, like a word or number, or even an image. I say “x” because, unlike a Byte, which has a fixed size of 8 bits, variables can be made up of any number of Bytes.</p>
<p>The number <code>52</code>, for example, can fit in a single Byte, whereas an image would take up hundreds, if not thousands, of Bytes. Regardless of the size of either of these two entities, we can still store them in a single variable:</p>
<pre><code class="lang-python">the_number_fifty_two = <span class="hljs-number">52</span>
profile_pic = <span class="hljs-string">'&lt;svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 448 512"&gt;&lt;!--!Font Awesome Free 6.7.2 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free Copyright 2025 Fonticons, Inc.--&gt;&lt;path d="M224 256A128 128 0 1 0 224 0a128 128 0 1 0 0 256zm-45.7 48C79.8 304 0 383.8 0 482.3C0 498.7 13.3 512 29.7 512l388.6 0c16.4 0 29.7-13.3 29.7-29.7C448 383.8 368.2 304 269.7 304l-91.4 0z"/&gt;&lt;/svg&gt;'</span>
</code></pre>
<p>In addition to a variable being able to store varied information, the box or size of the variable can also grow or shrink as you add or remove data. Example: when the variable is first created, we only have 2 names:</p>
<pre><code class="lang-python">names = [<span class="hljs-string">"Kevin"</span>, <span class="hljs-string">"John"</span>]
</code></pre>
<p>But, after we run the program, some process appends more names to the list, and thus our variable automatically grows in memory to accommodate more items:</p>
<pre><code class="lang-python">.... do some computation to change names ....
names = [<span class="hljs-string">"Kevin"</span>, <span class="hljs-string">"John"</span>, <span class="hljs-string">"Peter"</span>, <span class="hljs-string">"Paul"</span>, <span class="hljs-string">"David"</span>, <span class="hljs-string">"Stacey"</span>]
</code></pre>
<p>Languages like Python allow for this automatically without much effort from the programmer; however, other languages like C require you to specify the size of a variable upfront more often than not.</p>
<p>Most languages, like Python, have something called “Garbage collection”. Now, what this basically means is that the programming language will watch your variables and automatically delete the memory they occupy whenever that variable is no longer needed (known as going out of scope).</p>
<p>Garbage collection is often not available in some languages like C and C++; thus, as a programmer, you have to manually delete variables (or deallocate memory) when they are no longer needed and carefully create variables cognizant of the fact that you must take up as little memory as possible.</p>
<p>This is also why languages like C are used in game development, hardware drivers, and so on, because by manually managing memory, you can optimize your program to use as little RAM as possible and also deallocate memory as quickly as possible.</p>
<p>Since the garbage collection process itself eats up some memory and may not clear memory quickly enough, languages like Python are often used in web development and other environments where the code runs on bulky servers with lots of RAM, thus the need to minimize memory usage isn’t as important.</p>
<p>In summary, a variable is a box in memory that we use in programming languages to store information. Variables can store a wide variety of information, and each box can be of varying sizes depending on the type of variable.</p>
<p>Ultimately, each variable is just a grouping of bytes in memory. The variable can be made up of any number of Bytes (very similar to how we put 8 bits together to make one Byte, we put x number of Bytes together to make a variable).</p>
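<p>You can observe a variable growing in memory with Python’s <code>sys.getsizeof</code>. The exact byte counts vary by Python version and platform, so treat the printed numbers as illustrative:</p>

```python
import sys

names = ["Kevin", "John"]
before = sys.getsizeof(names)   # bytes occupied by the list object itself

names += ["Peter", "Paul", "David", "Stacey"]
after = sys.getsizeof(names)

print(before, after)  # the list now occupies more bytes than before
```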
<h2 id="heading-primitive-data-types">Primitive Data Types</h2>
<p>Okay, so we learnt that in programming languages we use a variable, which essentially is a box to put data in. We also know that variables can hold a wide variety of information.</p>
<p>The problem is that programming is precise; you want to be as precise as possible. You don’t want to mix numbers and letters willy-nilly. Some languages, like PHP and Python, are a bit flexible in this regard; they are what we call dynamically typed languages.</p>
<p>In Python, this is valid (although never a good idea):</p>
<pre><code class="lang-python">my_name = <span class="hljs-string">"Billy"</span>
my_name = <span class="hljs-number">123</span>
</code></pre>
<p>Now, looking at the code carefully: I created a variable “my_name”, put the word “Billy” inside that box, and then, on the very next line, replaced “Billy” with 123. Python allows this because it doesn’t enforce types by default.</p>
<p>Can you imagine working on a large financial system, and somebody mistakenly assigns text to a variable:</p>
<pre><code class="lang-python">account_balance=<span class="hljs-number">500</span>
account_balance=<span class="hljs-string">"500"</span>
</code></pre>
<p>The problem is that they may look like the same value, 500, but actually, they are not. See what happens when I try to update the account balance:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>account_balance=<span class="hljs-string">"500"</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>account_balance+<span class="hljs-number">1</span>
Traceback (most recent call last):
  File <span class="hljs-string">"&lt;stdin&gt;"</span>, line <span class="hljs-number">1</span>, <span class="hljs-keyword">in</span> &lt;module&gt;
TypeError: can only concatenate str (<span class="hljs-keyword">not</span> <span class="hljs-string">"int"</span>) to str
&gt;&gt;&gt;
</code></pre>
<p>Since “500” in quotes is actually a string (a chain of characters), Python throws an error, saying: Wait a minute! I cannot add 1 to the word “500”. “500” is not a number!</p>
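<p>One way around this in Python is to check the type and convert explicitly before doing arithmetic, for example:</p>

```python
account_balance = "500"
print(type(account_balance).__name__)   # str - not a number yet!

# Convert the string to an integer explicitly, then arithmetic works:
account_balance = int(account_balance)
print(account_balance + 1)              # 501
```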
<p>In some languages, like Golang, TypeScript, and C#, doing something like the above simply won’t compile. Here’s a simple Go example:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> <span class="hljs-string">"fmt"</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">var</span> accountBalance <span class="hljs-keyword">int</span> = <span class="hljs-number">1</span>
    fmt.Println(accountBalance)
}
</code></pre>
<p>Don’t worry so much about the syntax, such as “package”, “import”, “func”, etc. I want you to zone in on this line:</p>
<pre><code class="lang-go"><span class="hljs-keyword">var</span> accountBalance <span class="hljs-keyword">int</span> = <span class="hljs-number">1</span>
</code></pre>
<blockquote>
<p>💡 The “var” keyword simply tells the Go compiler that you are making a new variable named “accountBalance”.</p>
</blockquote>
<p>You will notice the <code>int</code> keyword. This is known as a primitive data type. The <code>int</code> data type tells the Golang compiler that this box in memory can only hold numbers, and more specifically, an integer. So when I do something like this, it will fail to compile:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> <span class="hljs-string">"fmt"</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">var</span> accountBalance <span class="hljs-keyword">int</span> = <span class="hljs-number">1</span>
    accountBalance = <span class="hljs-string">"pokemon"</span>
    fmt.Println(accountBalance)
}
</code></pre>
<p>The error:</p>
<pre><code class="lang-go">➜  test <span class="hljs-keyword">go</span> run main.<span class="hljs-keyword">go</span>
# command-line-arguments
./main.<span class="hljs-keyword">go</span>:<span class="hljs-number">7</span>:<span class="hljs-number">19</span>: cannot use <span class="hljs-string">"pokemon"</span> (untyped <span class="hljs-keyword">string</span> constant) as <span class="hljs-keyword">int</span> value in assignment
</code></pre>
<p>Languages like Golang are known as “statically typed” languages.</p>
<p>Essentially, data types allow us to categorize variables so that we can store information in a structured way. The goal is to separate information based on different characteristics, such as integers (whole numbers), strings (text), floating-point numbers (decimals), and other distinct types.</p>
<blockquote>
<p>💡 In the real world, you can think of data types as classifying different pieces of clothing. Just as we categorize items as "boots," "hats," or "shorts" based on their distinct characteristics and purposes, data types help us organize different kinds of information based on their unique properties and how they can be used.</p>
</blockquote>
<p>Different languages have slight variations of these primitive data types, but in general, they mostly fall into the following categories:</p>
<ul>
<li><p>Char - a single character like ‘A’.</p>
</li>
<li><p>String - a string of characters like “Hello”.</p>
</li>
<li><p>Byte - we already know this one.</p>
</li>
<li><p>Boolean - true or false</p>
</li>
<li><p>Decimal number - so basically similar to money, e.g., 120.99</p>
</li>
<li><p>Float - similar to Decimals, except they are not as precise</p>
</li>
<li><p>Integers - these are whole numbers 1,2,3,4,5, and can be negative as well, 1,-2,-3. You also get variations in size, int, int64, and so on.</p>
</li>
</ul>
<p>There’s way more to primitive data types than this simple article can cover, but hopefully, this is enough of a starting point for you to build upon.</p>
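<p>For reference, here’s roughly how those categories map to Python’s built-in types. Note that Python folds a few of them together: there is no separate char type (a single character is just a string of length 1), and precise decimals live in the standard <code>decimal</code> module rather than being a built-in:</p>

```python
print(type("A").__name__)            # str  (no char type; a 1-character string)
print(type("Hello").__name__)        # str
print(type(bytes([112])).__name__)   # bytes
print(type(True).__name__)           # bool
print(type(120.99).__name__)         # float
print(type(-3).__name__)             # int
```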
<h2 id="heading-collections-and-dictionaries">Collections and dictionaries</h2>
<p>Okay, so we learnt about variables, which are boxes in memory that you can put information into, and we learnt that each variable can have a type as well, like integers, strings, booleans, etc (primitive types).</p>
<p>In addition to these primitive types, variables can also hold more complex data types such as objects, lists, dictionaries, tuples, and more. Usually, complex data types are made up of one or more of the primitive types.</p>
<p>Let’s look at 2 of the most common types, i.e., lists and dictionaries. A list, as the name suggests, is a type that can hold a list of items:</p>
<pre><code class="lang-python">shopping_list = [<span class="hljs-string">"eggs"</span>, <span class="hljs-string">"milk"</span>, <span class="hljs-string">"butter"</span>]
</code></pre>
<p>Using Python, we created a “shopping_list” variable, which is a list of 3 items. Each list item in this case is of a primitive string type; however, the type can be just about any of the other types, including a complex type.</p>
<p>In programming, lists are very handy and used in just about every program. Some examples:</p>
<ul>
<li><p>In card games, you can use a list to store the deck of 52 cards.</p>
</li>
<li><p>In e-commerce, A list can store all items in the user’s shopping basket.</p>
</li>
<li><p>In a website’s navigation bar, a list can be used to store links to the relevant pages.</p>
</li>
</ul>
<p>Unlike the primitive data types, a list can hold multiple values (usually these values are of the same primitive type, but not always), which makes it perfect for storing related items like products in a shopping cart.</p>
<p>In other languages, lists are also referred to as “arrays”; however, arrays can also have other features not supported by a simple list. So while the data type is similar, it’s not always exactly the same type.</p>
<p>For example, in JavaScript, arrays work much like Python lists:</p>
<pre><code class="lang-javascript"><span class="hljs-keyword">let</span> shoppingArray = [<span class="hljs-string">"eggs"</span>, <span class="hljs-string">"milk"</span>, <span class="hljs-string">"butter"</span>];
</code></pre>
<p>However, in PHP, while the type is called “array”, it’s much more powerful than a simple list since it also has features of the “dictionary” data type.</p>
<pre><code class="lang-php">$shoppingArray = [<span class="hljs-string">"eggs"</span>, <span class="hljs-string">"milk"</span>, <span class="hljs-string">"butter"</span>];
$personArray = [<span class="hljs-string">"name"</span> =&gt; <span class="hljs-string">"Kevin"</span>, <span class="hljs-string">"surname"</span> =&gt; <span class="hljs-string">"Naidoo"</span>];
</code></pre>
<p>Lists or Arrays are usually zero-indexed, meaning that when we create the Python shopping list, under the hood, the language is assigning each item a position in the list:</p>
<blockquote>
<p>💡 The <code>#</code> character in Python is a comment, meaning that the interpreter will ignore this. It’s merely there to help programmers communicate extra information to other programmers.</p>
</blockquote>
<pre><code class="lang-python">shopping_list = [<span class="hljs-string">"eggs"</span>, <span class="hljs-string">"milk"</span>, <span class="hljs-string">"butter"</span>]

<span class="hljs-comment"># Positions</span>
<span class="hljs-comment"># 0 =&gt; eggs</span>
<span class="hljs-comment"># 1 =&gt; milk</span>
<span class="hljs-comment"># 2 =&gt; butter</span>

print(shopping_list[<span class="hljs-number">0</span>])  <span class="hljs-comment"># Will print eggs</span>
print(shopping_list[<span class="hljs-number">1</span>])  <span class="hljs-comment"># Will print milk</span>
</code></pre>
<p>The number we’re accessing above is known as an “index” and points to the position of the item in the list from left to right, where the leftmost position is zero and the index increases by 1 as you move further right.</p>
<p>Moving backwards:</p>
<pre><code class="lang-python">shopping_list = [<span class="hljs-string">"eggs"</span>, <span class="hljs-string">"milk"</span>, <span class="hljs-string">"butter"</span>]

<span class="hljs-comment"># Positions</span>
<span class="hljs-comment"># -3 =&gt; eggs</span>
<span class="hljs-comment"># -2 =&gt; milk</span>
<span class="hljs-comment"># -1 =&gt; butter</span>

print(shopping_list[<span class="hljs-number">-1</span>])  <span class="hljs-comment"># Will print butter</span>
print(shopping_list[<span class="hljs-number">-2</span>])  <span class="hljs-comment"># Will print milk</span>
</code></pre>
<p>When moving backwards, the index starts at <code>-1</code>, and you subtract 1 each time you move towards the left. The farthest item on the left will therefore be <code>-3</code> in this case; in general, it’s <code>total_items * -1</code>.</p>
<p>It’s not common to move backwards, so 99% of the time, you use this approach to simply get the last item when you don’t know the size of the list.</p>
<p>For example, in a deck of 52 cards, you may play some cards into a discard pile. Now, to access the last card in the deck, you can’t simply use <code>cards[51]</code>. Unless you count the cards in the deck again, you have no idea how many cards remain in the main deck.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748086775647/2f337715-c6ca-477d-a3b0-0ba5a4662e0f.png" alt class="image--center mx-auto" /></p>
<p>By using <code>cards[-1]</code>, you will always point to the last item in the deck, so long as you have at least one card. Thus, you can always get the last item dynamically without having to constantly recount the deck.</p>
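<p>A quick sketch of the card-deck idea, using numbers 1 to 52 as stand-ins for real cards:</p>

```python
deck = list(range(1, 53))   # a pretend deck: cards numbered 1 to 52

# Play three cards from the top (the end of the list) into a discard pile:
discard_pile = [deck.pop(), deck.pop(), deck.pop()]

# -1 always points at the last remaining card, however many are left:
print(deck[-1])     # 49
print(len(deck))    # 49
```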
<p>Okay, so now we know that a list contains multiple items and each item is assigned a numerical position starting from 0. This data type is great if you don’t want to label the items, but what if you need to label the items?</p>
<p>For example, in your app, you have a dropdown asking users to select their country. You don’t want to manually type out each option by hand, so behind the scenes, you use a Python list populated from a database table to build an HTML dropdown:</p>
<pre><code class="lang-python"><span class="hljs-comment"># This data will be filled from the DB, just showing two values as an example.</span>
countries = [<span class="hljs-string">"za"</span>, <span class="hljs-string">"us"</span>]
</code></pre>
<pre><code class="lang-xml"><span class="hljs-tag">&lt;<span class="hljs-name">select</span>&gt;</span>
     <span class="hljs-tag">&lt;<span class="hljs-name">option</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"za"</span>&gt;</span>za<span class="hljs-tag">&lt;/<span class="hljs-name">option</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">option</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"us"</span>&gt;</span>us<span class="hljs-tag">&lt;/<span class="hljs-name">option</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">select</span>&gt;</span>
</code></pre>
<p>The above is a simple HTML dropdown that will allow the user to select their country preference. The problem is, the user might not know their country code, and it’s far easier to select the full country name instead.</p>
<p>In the case of our dropdown, the server gets the value from the <code>value</code> field; it does not care about the label you see inside the <code>&lt;option&gt;</code> tags. In our Python code, we can therefore change the list into a dictionary (also known as a map in other languages). By doing so, we can set a custom index for each item, which will become the value set for the <code>value</code> field in our HTML.</p>
<p>Here’s how to make a dictionary:</p>
<pre><code class="lang-python"><span class="hljs-comment"># You can also put these on one line like a list, I did it this way for readability.</span>
countries = {
   <span class="hljs-string">"za"</span>: <span class="hljs-string">"South Africa"</span>,
   <span class="hljs-string">"us"</span>: <span class="hljs-string">"United States"</span>,
   <span class="hljs-string">"uk"</span>:<span class="hljs-string">"United Kingdom"</span>
}
</code></pre>
<p>A dictionary is very similar to a list, but as you can see, instead of the automatic 0, 1, 2, 3 index, we set a custom index for each item, known as a <code>key</code>. In this example each <code>key</code> is a <code>string</code> that serves as a label to better identify that piece of data (other immutable values, such as numbers, can be keys too).</p>
<p>Now, we can modify our dropdown to use the dictionary, so the result should be:</p>
<pre><code class="lang-xml"><span class="hljs-tag">&lt;<span class="hljs-name">select</span>&gt;</span>
     <span class="hljs-tag">&lt;<span class="hljs-name">option</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"za"</span>&gt;</span>South Africa<span class="hljs-tag">&lt;/<span class="hljs-name">option</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">option</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"us"</span>&gt;</span>United States<span class="hljs-tag">&lt;/<span class="hljs-name">option</span>&gt;</span>
       <span class="hljs-tag">&lt;<span class="hljs-name">option</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"uk"</span>&gt;</span>United Kingdom<span class="hljs-tag">&lt;/<span class="hljs-name">option</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">select</span>&gt;</span>
</code></pre>
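<p>To see how the key/value split maps onto the dropdown, here’s a small sketch in plain Python; <code>render_dropdown</code> is a made-up helper for illustration, not part of any framework:</p>

```python
countries = {
    "za": "South Africa",
    "us": "United States",
    "uk": "United Kingdom",
}

def render_dropdown(options):
    # .items() yields (key, value) pairs: the key becomes the option's
    # value attribute, and the value becomes the human-readable label.
    rows = [f'<option value="{code}">{name}</option>'
            for code, name in options.items()]
    return "<select>\n" + "\n".join(rows) + "\n</select>"

print(render_dropdown(countries))
```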
<p>A dictionary, therefore, is much more powerful: not only can you store related information together, you can also label your data and model more interesting entities. For example, suppose you want to store information about a book, “The Pragmatic Programmer”, using a list:</p>
<pre><code class="lang-python">the_book = [
   <span class="hljs-string">"The Pragmatic Programmer"</span>,
   <span class="hljs-string">"Journeyman to Master"</span>,
    <span class="hljs-number">1999</span>, 
    <span class="hljs-string">"Pearson Education"</span>,
    <span class="hljs-number">352</span>,
    <span class="hljs-number">9780132119177</span>
]
</code></pre>
<p>We have 6 items, and it’s fairly easy to guess what they represent, but imagine if we had 20 items. This becomes really cumbersome. Instead, we can use a dictionary:</p>
<pre><code class="lang-python">the_book = {
   <span class="hljs-string">"title"</span>: <span class="hljs-string">"The Pragmatic Programmer"</span>,
   <span class="hljs-string">"subtitle"</span>: <span class="hljs-string">"Journeyman to Master"</span>,
    <span class="hljs-string">"Year Published"</span>: <span class="hljs-number">1999</span>, 
    <span class="hljs-string">"Publisher"</span>: <span class="hljs-string">"Pearson Education"</span>,
    <span class="hljs-string">"Number of pages"</span>: <span class="hljs-number">352</span>,
    <span class="hljs-string">"Barcode / ISBN"</span>: <span class="hljs-number">9780132119177</span>
}
</code></pre>
<p>Looking at this second version, it’s far easier to identify what each data point means just by reading the dictionary’s <code>key</code>s. When we want to print an item, we can do so in a similar way to lists, using the <code>key</code> instead of a numerical position:</p>
<pre><code class="lang-python">print(the_book[<span class="hljs-string">'title'</span>]) <span class="hljs-comment"># The Pragmatic Programmer</span>
print(the_book[<span class="hljs-string">'Year Published'</span>]) <span class="hljs-comment"># 1999</span>
</code></pre>
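<p>One thing to watch out for: indexing a key that doesn’t exist raises a <code>KeyError</code>, while the <code>get</code> method returns <code>None</code> (or a default you supply) instead. A quick sketch:</p>

```python
the_book = {
    "title": "The Pragmatic Programmer",
    "Year Published": 1999,
}

# Square brackets raise a KeyError for a missing key...
try:
    the_book["author"]
except KeyError:
    print("No such key!")  # prints "No such key!"

# ...whereas .get() fails softly.
print(the_book.get("author"))             # None
print(the_book.get("author", "Unknown"))  # Unknown
```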
<p><a target="_blank" href="https://learnpython.com/?ref=ngm5zjv"><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1748090476214/d6b70ba6-a9cb-462b-82d0-cba7af0ca725.jpeg" alt class="image--center mx-auto" /></a></p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this article, we learned about one of the most fundamental concepts in computer programming, i.e., data. We started at the very heart of the computer to understand how binary numbers represent electrical signals in transistors that allow the current to flow or prevent the current from flowing in a particular direction.</p>
<p>We then built upon this knowledge to learn about <code>bytes</code>, the fundamental “Lego blocks” of modern computing. We even learned some math, and further expanded into programming concepts like compilers, variables, data types, and more.</p>
<p>We’ve seen a little sneak peek into Python, Golang, and other languages.</p>
<p>All in all, this was a fun article to write; it allowed me to refresh some of these concepts I learned many, many years ago. I hope that this was an ideal primer for you, to introduce you to these important fundamentals and help you grow as a developer.</p>
<p>Use this as a springboard into the world of programming; I merely got your feet wet, but to become a good programmer and have a long-lasting career in the industry, you must keep learning and constantly dig deeper.</p>
<p>Happy coding! I’m planning on more content like this that dives deeper into the engineering aspects of programming, so please join the newsletter or check back soon.</p>
<p>Thank you so much for reading <strong>🙏</strong></p>
<p>Cheers 👋</p>
]]></content:encoded></item><item><title><![CDATA[Build a basic login system using Django]]></title><description><![CDATA[Many know and love Django admin (myself included). The problem is that it can be somewhat rigid, and well, let's be honest, it looks like it’s from 2010. Nevertheless, in Django, it’s stupidly simple to implement your own auth by extending the framework’...]]></description><link>https://kevincoder.co.za/build-a-basic-login-system-using-django</link><guid isPermaLink="true">https://kevincoder.co.za/build-a-basic-login-system-using-django</guid><category><![CDATA[Developer]]></category><category><![CDATA[Django]]></category><category><![CDATA[authentication]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Mon, 14 Apr 2025 16:33:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1744648515220/ebcfaff6-35b8-4dea-8ce5-95adb12fa506.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Many know and love Django admin (myself included). The problem is that it can be somewhat rigid, and well, let's be honest, it looks like it’s from 2010. Nevertheless, in Django, it’s stupidly simple to implement your own auth by extending the framework’s built-in views and templates.</p>
<p>In this article, let’s look at a quick-fire way of implementing an auth system in a matter of minutes.</p>
<p><a target="_blank" href="https://learnpython.com?ref=ngm5zjv"><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1745500536842/6103a7e0-09dd-42ce-9e93-671c157023ff.jpeg" alt class="image--center mx-auto" /></a></p>
<h2 id="heading-a-quick-django-primer">A quick Django primer</h2>
<p>This article is designed for developers who are familiar with Django, but just in case you are a complete beginner or just “rusty,” let’s go through some basics:</p>
<p><strong>Virtual Environment:</strong></p>
<p>To get started with Django development, you need a virtual environment. A virtual environment is simply an isolated Python installation that won’t interfere with the global Python installed on your system.</p>
<p>Virtual environments are especially useful on Linux machines. Since most Linux distros use Python for various core libraries, by installing your project packages at the system level, you could potentially break your whole desktop environment!</p>
<p>To set up a virtual environment for our project, run this in your terminal:</p>
<pre><code class="lang-bash">mkdir django_projects
cd django_projects
python -m venv .venv
source .venv/bin/activate
</code></pre>
<p>To exit the virtual environment, just type “deactivate” in your terminal; to re-enter it, run the last command above again, making sure you’re in the correct folder.</p>
<p>Now let’s install Django and create our project:</p>
<pre><code class="lang-bash">pip install django
django-admin startproject authsystem
<span class="hljs-built_in">cd</span> authsystem
</code></pre>
<p>When you create a new Django project, you should see this folder structure:</p>
<pre><code class="lang-bash">.
├── authsystem
│   ├── __init__.py
│   ├── asgi.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
└── manage.py
</code></pre>
<p>Very barebones and minimal; such is the beauty of Django! Inside your project, you will find a “manage.py” script, similar to the “django-admin” command, which lets you perform various console tasks. The only difference is that “manage.py” is scoped to your project and is aware of your project’s settings and apps.</p>
<p>Let’s create an application called “accounts”:</p>
<pre><code class="lang-bash">python manage.py startapp accounts
</code></pre>
<p>Django has a very clean architecture in that you have projects and apps:</p>
<ul>
<li><p><strong>Projects</strong>: Are a collection of apps, usually your website project, or a library or framework that you're developing. They help organize related code and resources under a single umbrella, making it easier to manage dependencies, configuration, and deployments.</p>
</li>
<li><p><strong>Apps</strong>: Are like small libraries, they should have one major purpose. Example: an "accounts" app manages everything related to user accounts, a “documents” app manages everything related to parsing and storing documents and so forth.</p>
</li>
</ul>
<p>An app generally consists of:</p>
<pre><code class="lang-bash">├── __init__.py
├── admin.py
├── apps.py
├── migrations
│   └── __init__.py
├── models.py
├── tests.py
└── views.py
</code></pre>
<ul>
<li><p><strong>admin.py</strong>: We won’t use the admin panel in this tutorial, but basically registering models here will create a CRUD interface in the Django admin panel for that model.</p>
</li>
<li><p><strong>migrations</strong>: Any logic that needs to alter your database structure.</p>
</li>
<li><p><strong>models.py</strong>: A local mapping of database tables using Django’s powerful ORM.</p>
</li>
<li><p><strong>views.py:</strong> In “MVC” terms, these are actually “controllers.” In Django, the “V” (view) plays the controller role, while the “T” (template) plays the traditional view role, which is why Django is called an “MVT” framework.</p>
</li>
</ul>
<p>Django will not automatically register the “apps” you generate; you’ll need to register each one manually in order to use it within the project. Luckily, this is fairly easy to do: just add “accounts” to your settings.py file as follows:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Update authsystem/settings.py</span>

INSTALLED_APPS = [
    <span class="hljs-string">"django.contrib.admin"</span>,
    <span class="hljs-string">"django.contrib.auth"</span>,
    <span class="hljs-string">"django.contrib.contenttypes"</span>,
    <span class="hljs-string">"django.contrib.sessions"</span>,
    <span class="hljs-string">"django.contrib.messages"</span>,
    <span class="hljs-string">"django.contrib.staticfiles"</span>,
    <span class="hljs-string">"accounts"</span>            <span class="hljs-comment"># &lt;---- the app we created.</span>
]
</code></pre>
<p>While we’re in the settings file, let’s tell Django where to find our HTML templates.</p>
<p>First, let’s create the template folder. You should be at the project root at the same level as “authsystem” and “accounts”:</p>
<pre><code class="lang-bash">mkdir templates
</code></pre>
<p>Next, let’s update our settings file:</p>
<pre><code class="lang-python">TEMPLATES = [
    {
        <span class="hljs-string">"BACKEND"</span>: <span class="hljs-string">"django.template.backends.django.DjangoTemplates"</span>,
        <span class="hljs-string">"DIRS"</span>: [<span class="hljs-string">f"<span class="hljs-subst">{BASE_DIR}</span>/templates"</span>], <span class="hljs-comment"># &lt;---- Add template folder here.</span>
        <span class="hljs-string">"APP_DIRS"</span>: <span class="hljs-literal">True</span>,
        <span class="hljs-string">"OPTIONS"</span>: {
            <span class="hljs-string">"context_processors"</span>: [
                <span class="hljs-string">"django.template.context_processors.debug"</span>,
                <span class="hljs-string">"django.template.context_processors.request"</span>,
                <span class="hljs-string">"django.contrib.auth.context_processors.auth"</span>,
                <span class="hljs-string">"django.contrib.messages.context_processors.messages"</span>,
            ],
        },
    },
]
</code></pre>
<p>Django is very flexible; you can place templates just about anywhere, including inside each individual app’s folder. Since this project is a single web app rather than a library or a set of shareable components, we’ll just use one templates folder at the project root.</p>
<h2 id="heading-getting-started">Getting started</h2>
<p>Great! Now that you have a project and an accounts app. Let’s first run migrations:</p>
<pre><code class="lang-bash">python manage.py migrate
</code></pre>
<p>By default, Django will use “sqlite” as its database, which makes local development much easier, as you don’t have to configure anything in order to get a database up and running. Since Django is ORM-based, it’s usually easy to just swap out the database type in production to something like PostgreSQL.</p>
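<p>For reference, swapping the default sqlite database for PostgreSQL in production is usually just a settings change; the database name and credentials below are placeholders, not values from this tutorial:</p>

```python
# settings.py (production) -- a sketch of a PostgreSQL configuration.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "authsystem",       # placeholder database name
        "USER": "authsystem_user",  # placeholder credentials
        "PASSWORD": "change-me",
        "HOST": "127.0.0.1",
        "PORT": "5432",
    }
}
```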
<p>You will notice that a whole bunch of migrations run, even though we haven’t added a single model or database table yet:</p>
<pre><code class="lang-bash">  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying admin.0003_logentry_add_action_flag_choices... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying auth.0009_alter_user_last_name_max_length... OK
  Applying auth.0010_alter_group_name_max_length... OK
  Applying auth.0011_update_proxy_permissions... OK
  Applying auth.0012_alter_user_first_name_max_length... OK
  Applying sessions.0001_initial... OK
</code></pre>
<p>These are migrations for some Django admin features, including various auth tables such as permissions, groups, and a user table to store registered users. They’re sensible defaults you’ll use in nearly every project, so they’re almost always a welcome addition.</p>
<p>To get started with our custom login system, let’s add a few routes to authsystem/urls.py as follows:</p>
<blockquote>
<p>💡 A cleaner way to add URLs is to use the “accounts” app’s own urls.py file and then import its “urlpatterns” into the main urls.py. I’m putting everything in the project’s urls.py for simplicity.</p>
</blockquote>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> django.contrib <span class="hljs-keyword">import</span> admin
<span class="hljs-keyword">from</span> django.urls <span class="hljs-keyword">import</span> path

<span class="hljs-keyword">from</span> accounts.views <span class="hljs-keyword">import</span> account_login, account_logout, account_register,dashboard

urlpatterns = [
    path(<span class="hljs-string">"login/"</span>, account_login, name=<span class="hljs-string">"login"</span>),
    path(<span class="hljs-string">"register/"</span>, account_register, name=<span class="hljs-string">"register"</span>),
    path(<span class="hljs-string">"logout/"</span>, account_logout, name=<span class="hljs-string">"logout"</span>),
    path(<span class="hljs-string">"dashboard/"</span>, dashboard, name=<span class="hljs-string">"dashboard"</span>),
]
</code></pre>
<p>Your IDE or code editor might complain that these functions don’t exist; just ignore those warnings for now. We’ll add them in a bit.</p>
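<p>As an aside, the cleaner layout mentioned in the note above might look like the following sketch, using Django’s <code>include()</code>; the rest of this tutorial keeps everything in the project’s urls.py:</p>

```python
# accounts/urls.py -- the app owns its routes.
from django.urls import path
from . import views

urlpatterns = [
    path("login/", views.account_login, name="login"),
    path("register/", views.account_register, name="register"),
    path("logout/", views.account_logout, name="logout"),
    path("dashboard/", views.dashboard, name="dashboard"),
]

# authsystem/urls.py -- the project pulls the app's routes in.
from django.urls import include, path

urlpatterns = [
    path("", include("accounts.urls")),
]
```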
<h2 id="heading-building-the-login-page">Building the login page</h2>
<p>Before we get to the actual view functions, let me just list out all the imports we're going to need here so it’s easier for you to follow along instead of constantly repeating these:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> django.contrib <span class="hljs-keyword">import</span> messages
<span class="hljs-keyword">from</span> django.contrib.auth <span class="hljs-keyword">import</span> authenticate, login, logout
<span class="hljs-keyword">from</span> django.contrib.auth.decorators <span class="hljs-keyword">import</span> login_required
<span class="hljs-keyword">from</span> django.contrib.auth.models <span class="hljs-keyword">import</span> User
<span class="hljs-keyword">from</span> django.shortcuts <span class="hljs-keyword">import</span> redirect, render

<span class="hljs-keyword">from</span> .forms <span class="hljs-keyword">import</span> RegisterForm
</code></pre>
<p>Now that we have the routes in place, let’s build the login view. In your “accounts/views.py” file, add the following:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">account_login</span>(<span class="hljs-params">request</span>):</span>
    <span class="hljs-keyword">if</span> request.user.is_authenticated:
        <span class="hljs-keyword">return</span> redirect(<span class="hljs-string">"dashboard"</span>)

    <span class="hljs-keyword">if</span> request.method == <span class="hljs-string">"POST"</span>:
        username = request.POST.get(<span class="hljs-string">"username"</span>)
        password = request.POST.get(<span class="hljs-string">"password"</span>)
        user = authenticate(request, username=username, password=password)

        <span class="hljs-keyword">if</span> user <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>:
            login(request, user)
            <span class="hljs-keyword">return</span> redirect(<span class="hljs-string">"dashboard"</span>)
        <span class="hljs-keyword">else</span>:
            messages.error(request, <span class="hljs-string">"Invalid username or password."</span>)

    <span class="hljs-keyword">return</span> render(request, <span class="hljs-string">"accounts/login.html"</span>)
</code></pre>
<blockquote>
<p>ℹ️ Notice I am prefixing some of these views with “account_”. This is because we import some functions from Django named “login”, “logout” and so on, thus, doing so just prevents clashing with those function names.</p>
</blockquote>
<p>We first use <code>request.user.is_authenticated</code> to check if the user is logged in, and if they are, we just redirect to the user dashboard page. Otherwise, we take one of two actions:</p>
<ul>
<li><p>If the request is a POST request, it means that the user submitted their username and password, so we should validate those credentials. We use Django’s <code>authenticate</code> helper, which takes care of hashing the submitted password and comparing it with the hashed password stored in the DB.</p>
</li>
<li><p>If it’s a GET request, just show the login form.</p>
</li>
</ul>
<p>Let’s create a folder to store all of our auth templates:</p>
<pre><code class="lang-bash">mkdir -p templates/accounts
touch templates/accounts/login.html
touch templates/accounts/register.html
touch templates/accounts/dashboard.html
</code></pre>
<p>Next, paste this into the “login.html” template</p>
<pre><code class="lang-xml">{% if messages %}
&lt;div class="mb-4"&gt;
    {% for message in messages %}
    &lt;div&gt;
        {{ message }}
    &lt;/div&gt;
    {% endfor %}
&lt;/div&gt;
{% endif %}

&lt;form action="{% url 'login' %}" method="POST"&gt;
{% csrf_token %}

&lt;label&gt;Email Address&lt;/label&gt; &lt;br /&gt;
&lt;input type="text" id="username" name="username" required /&gt; &lt;br /&gt;

&lt;label&gt;Password&lt;/label&gt; &lt;br /&gt;
&lt;input type="password" id="password" name="password" required /&gt; &lt;br /&gt;
&lt;input type="submit" value="Login" /&gt;
&lt;/form&gt;
</code></pre>
<p>This is a very ugly, unstyled form, but I kept it simple so you can see what’s going on. Notice we also print messages; this is another useful Django feature. It’s basically a flash message where you can pass success/failure messages between views and between the view and the template.</p>
<p>Handy for showing error messages when the user doesn’t fill in something or fills in incorrect information.</p>
<p>Also note: we include <code>{% csrf_token %}</code> which is a security mechanism to protect your forms from “Cross-Site Request Forgery” attacks. Since this is a simple 2-field form, I didn’t create a form in “forms.py”, but usually, with Django, you would create a form class when dealing with user input.</p>
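<p>For completeness, Django also ships a ready-made <code>AuthenticationForm</code> you could use here instead of reading <code>request.POST</code> directly; this is just a sketch and is not used in the rest of the tutorial:</p>

```python
# Alternative login view using Django's built-in AuthenticationForm,
# which handles field validation and calls authenticate() internally.
from django.contrib.auth import login
from django.contrib.auth.forms import AuthenticationForm
from django.shortcuts import redirect, render

def account_login(request):
    # Bind POST data when present; an empty QueryDict is falsy, so on
    # GET requests the form is rendered unbound.
    form = AuthenticationForm(request, data=request.POST or None)
    if request.method == "POST" and form.is_valid():
        # is_valid() already authenticated the user for us.
        login(request, form.get_user())
        return redirect("dashboard")
    return render(request, "accounts/login.html", {"form": form})
```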
<h2 id="heading-building-the-registration-view">Building the registration view</h2>
<p>For this one, we are capturing quite a few fields, therefore, we’ll use Django forms as well to manage the form handling better. Let’s start off by creating a “forms.py” in the “accounts” folder:</p>
<pre><code class="lang-bash">touch accounts/forms.py
</code></pre>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> django <span class="hljs-keyword">import</span> forms
<span class="hljs-keyword">from</span> django.contrib.auth.forms <span class="hljs-keyword">import</span> UserCreationForm
<span class="hljs-keyword">from</span> django.contrib.auth.models <span class="hljs-keyword">import</span> User
<span class="hljs-keyword">from</span> django.core.exceptions <span class="hljs-keyword">import</span> ValidationError

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">RegisterForm</span>(<span class="hljs-params">UserCreationForm</span>):</span>
    email = forms.EmailField(
        required=<span class="hljs-literal">True</span>,
        widget=forms.EmailInput()
    )
    first_name = forms.CharField(
        max_length=<span class="hljs-number">50</span>,
        required=<span class="hljs-literal">True</span>,
        widget=forms.TextInput()
    )
    last_name = forms.CharField(
        max_length=<span class="hljs-number">50</span>,
        required=<span class="hljs-literal">True</span>,
        widget=forms.TextInput()
    )

    <span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Meta</span>:</span>
        model = User
        fields = (<span class="hljs-string">'first_name'</span>, <span class="hljs-string">'last_name'</span>, <span class="hljs-string">'email'</span>, <span class="hljs-string">'password1'</span>, <span class="hljs-string">'password2'</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">save</span>(<span class="hljs-params">self, commit=True</span>):</span>
        user = super().save(commit=<span class="hljs-literal">False</span>)
        user.username = self.cleaned_data.get(<span class="hljs-string">'email'</span>)
        user.email = self.cleaned_data.get(<span class="hljs-string">'email'</span>)
        user.first_name = self.cleaned_data.get(<span class="hljs-string">'first_name'</span>)
        user.last_name = self.cleaned_data.get(<span class="hljs-string">'last_name'</span>)

        <span class="hljs-keyword">if</span> commit:
            user.save()
        <span class="hljs-keyword">return</span> user
</code></pre>
<p>Django provides a <code>UserCreationForm</code> class that takes care of validating and hashing the password, so we don’t have to reinvent the wheel here. We override the save method to ensure that the username is set to the user’s email address. The “username” field is a holdover from when separate usernames were the norm; nowadays, most applications simply use the email address as the username.</p>
<p>Also, overriding save allows you to do more customizations, such as:</p>
<ul>
<li><p>Setting a User profile. We’re using the default Django user model, but your application may need to store additional information such as birthdate, cellphone number, address, etc… In such a case, simply create a new model “UserProfile” and use the: <code>models.OneToOneField</code> relationship to link these models. Thereafter, you can store whatever extra information is needed via the save method.</p>
</li>
<li><p>Link a user to a Team. Great for SaaS type applications.</p>
</li>
</ul>
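<p>A minimal sketch of the “UserProfile” idea from the first bullet; the extra fields are examples, not anything prescribed by Django:</p>

```python
# accounts/models.py -- a profile linked one-to-one to Django's
# built-in User model.
from django.contrib.auth.models import User
from django.db import models

class UserProfile(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE)
    birthdate = models.DateField(null=True, blank=True)
    cellphone = models.CharField(max_length=20, blank=True)
    address = models.TextField(blank=True)
```

<p>Inside the form’s overridden save, after <code>user.save()</code> you could then create the linked profile, e.g. <code>UserProfile.objects.create(user=user)</code>, filling in whatever extra fields you collected.</p>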
<blockquote>
<p>⚠️ It’s worth mentioning that Django’s username field has a max length of 150 characters. This should generally be fine for most email addresses; however, you can always alter the column length using a migration. To get an idea of how, have a look in: .venv/lib/python3.12/site-packages/django/contrib/auth/migrations</p>
</blockquote>
<p>Next, let’s build our register view, in views.py:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">account_register</span>(<span class="hljs-params">request</span>):</span>
    <span class="hljs-keyword">if</span> request.user.is_authenticated:
        <span class="hljs-keyword">return</span> redirect(<span class="hljs-string">"dashboard"</span>)

    <span class="hljs-keyword">if</span> request.method == <span class="hljs-string">"POST"</span>:
        form = RegisterForm(request.POST)
        <span class="hljs-keyword">if</span> form.is_valid():
            <span class="hljs-keyword">try</span>:
                user = form.save()
                messages.success(
                    request, <span class="hljs-string">"Account created successfully. You can now log in."</span>
                )
                <span class="hljs-keyword">return</span> redirect(<span class="hljs-string">"login"</span>)
            <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
                messages.error(request, <span class="hljs-string">f"Error creating account: <span class="hljs-subst">{e}</span>"</span>)
    <span class="hljs-keyword">else</span>:
        form = RegisterForm()

    <span class="hljs-keyword">return</span> render(request, <span class="hljs-string">"accounts/register.html"</span>, {<span class="hljs-string">"form"</span>: form})
</code></pre>
<p>This is very similar to the login view, except we use a form this time. Finally, you need to paste this HTML in templates/accounts/register.html</p>
<pre><code class="lang-xml">&lt;div&gt;
  &lt;h2&gt;Register&lt;/h2&gt;

  {% <span class="hljs-keyword">if</span> messages %}
  &lt;div&gt;
    {% <span class="hljs-keyword">for</span> message <span class="hljs-keyword">in</span> messages %}
    &lt;div&gt;
      {{ message }}
    &lt;/div&gt;
    {% endfor %}
  &lt;/div&gt;
  {% endif %}

  &lt;form action="{% url 'register' %}" method="post"&gt;
    {% csrf_token %}

    &lt;div&gt;
      &lt;label <span class="hljs-keyword">for</span>=<span class="hljs-string">"{{ form.first_name.id_for_label }}"</span>&gt;First Name&lt;/label&gt;
      {{ form.first_name }}
      {% <span class="hljs-keyword">if</span> form.first_name.errors %}
        &lt;div&gt;{{ form.first_name.errors }}&lt;/div&gt;
      {% endif %}
    &lt;/div&gt;

    &lt;div&gt;
      &lt;label <span class="hljs-keyword">for</span>=<span class="hljs-string">"{{ form.last_name.id_for_label }}"</span>&gt;Last Name&lt;/label&gt;
      {{ form.last_name }}
      {% <span class="hljs-keyword">if</span> form.last_name.errors %}
        &lt;div&gt;{{ form.last_name.errors }}&lt;/div&gt;
      {% endif %}
    &lt;/div&gt;

    &lt;div&gt;
      &lt;label <span class="hljs-keyword">for</span>=<span class="hljs-string">"{{ form.email.id_for_label }}"</span>&gt;Email&lt;/label&gt;
      {{ form.email }}
      {% <span class="hljs-keyword">if</span> form.email.errors %}
        &lt;div&gt;{{ form.email.errors }}&lt;/div&gt;
      {% endif %}
    &lt;/div&gt;

    &lt;div&gt;
      &lt;label <span class="hljs-keyword">for</span>=<span class="hljs-string">"{{ form.password1.id_for_label }}"</span>&gt;Password&lt;/label&gt;
      {{ form.password1 }}
      {% <span class="hljs-keyword">if</span> form.password1.errors %}
        &lt;div&gt;{{ form.password1.errors }}&lt;/div&gt;
      {% endif %}
    &lt;/div&gt;

    &lt;div&gt;
      &lt;label <span class="hljs-keyword">for</span>=<span class="hljs-string">"{{ form.password2.id_for_label }}"</span>&gt;Confirm Password&lt;/label&gt;
      {{ form.password2 }}
      {% <span class="hljs-keyword">if</span> form.password2.errors %}
        &lt;div&gt;{{ form.password2.errors }}&lt;/div&gt;
      {% endif %}
    &lt;/div&gt;

    &lt;button type=<span class="hljs-string">"submit"</span>&gt;Register&lt;/button&gt;
  &lt;/form&gt;

  &lt;div&gt;
    &lt;p&gt;Already have an account? &lt;a href=<span class="hljs-string">"{% url 'login' %}"</span>&gt;Log <span class="hljs-keyword">in</span>&lt;/a&gt;&lt;/p&gt;
  &lt;/div&gt;
&lt;/div&gt;
</code></pre>
<h2 id="heading-the-logout-view">The Logout view</h2>
<p>Logout is the simplest of the three auth views: destroy the session and redirect to the login page:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">account_logout</span>(<span class="hljs-params">request</span>):</span>
    logout(request)
    <span class="hljs-keyword">return</span> redirect(<span class="hljs-string">"login"</span>)
</code></pre>
<h2 id="heading-restricting-access">Restricting access</h2>
<p>Now that you have a basic login system in place, we need a page for the user to land on after logging in. Let’s create the dashboard view in accounts/views.py:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">dashboard</span>(<span class="hljs-params">request</span>):</span>
    <span class="hljs-keyword">return</span> render(request, <span class="hljs-string">"accounts/dashboard.html"</span>, {})
</code></pre>
<p>And some simple HTML in templates/accounts/dashboard.html:</p>
<pre><code class="lang-python">&lt;h1&gt; Welcome {{request.user.first_name}} &lt;/h1&gt;
</code></pre>
<p>One problem, though: if you visit /dashboard without logging in, you can still access this page! To restrict this page so that only logged-in users can access it, we simply need to import and use the <code>@login_required</code> decorator as follows:</p>
<pre><code class="lang-python"><span class="hljs-meta">... </span>other imports here ....
<span class="hljs-keyword">from</span> django.contrib.auth.decorators <span class="hljs-keyword">import</span> login_required

<span class="hljs-meta">@login_required(login_url="/login")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">dashboard</span>(<span class="hljs-params">request</span>):</span>
   <span class="hljs-keyword">return</span> render(request, <span class="hljs-string">"accounts/dashboard.html"</span>, {})
</code></pre>
<p>Now you should get redirected to the login page. Easy, right?</p>
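<p>For completeness, here’s a minimal sketch of how these views might be wired up in urls.py. The route names ("login", "dashboard") match the ones referenced in the templates above; the <code>accounts.views</code> module path and the <code>account_login</code> view name are assumptions based on this tutorial’s layout:</p>
<pre><code class="lang-python">from django.urls import path

from accounts import views  # assumed module path for this tutorial

urlpatterns = [
    # name="login" is what {% url 'login' %} and redirect("login") resolve to
    path("login", views.account_login, name="login"),
    path("logout", views.account_logout, name="logout"),
    # protected by the @login_required decorator shown above
    path("dashboard", views.dashboard, name="dashboard"),
]
</code></pre>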
<h2 id="heading-conclusion">Conclusion</h2>
<p>The Django auth system is really powerful and robust. This article was merely a gentle introduction; there’s so much more you can do, especially with authorization, decorators, and more.</p>
<p>Hopefully, this gives you a good enough start to build out your own auth system. Even though in the modern era you could probably grab some boilerplate or use a third-party auth service, it’s still very handy to understand the fundamentals of how auth systems work.</p>
<p><a target="_blank" href="https://learnpython.com/?tap_a=131378-305e0d&amp;ref=ngm5zjv"><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1745422918547/7f8ad136-c5e1-4d2d-9f4b-f0d53d43bf91.jpeg" alt class="image--center mx-auto" /></a></p>
]]></content:encoded></item><item><title><![CDATA[Demystifying the dark arts of AI Agents 🤖]]></title><description><![CDATA[Everyone is talking about AI agents; how great they are, and how they'll replace developers.
Both hold some truth, but there's no need for concern. I am not an AI expert, but for the last few years, I have worked with almost every model that exists f...]]></description><link>https://kevincoder.co.za/demystifying-the-dark-arts-of-ai-agents</link><guid isPermaLink="true">https://kevincoder.co.za/demystifying-the-dark-arts-of-ai-agents</guid><category><![CDATA[AI Agents Explained]]></category><category><![CDATA[LLM's ]]></category><category><![CDATA[ai-agent]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Mon, 10 Feb 2025 07:03:59 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1739170915337/95c5ba7f-c6a6-47ae-ab84-a176ff0cc460.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Everyone is talking about AI agents; how great they are, and how they'll replace developers.</p>
<p>Both claims hold some truth, but there's no need for concern. I am not an AI expert, but for the last few years, I have worked with almost every model that exists, from GPT-4o to Sonnet 3.5 to even DeepSeek R1.</p>
<p>I have built complex workflows, trained classifiers, built simple chatbots, voice AI solutions, and much more. Throughout this experience, I have learned about the limitations of AI and the difference between actual functionality versus marketing hype.</p>
<p>In this article, we'll go on a mini journey into the deep underbelly of the AI world and explore where the technology is today and how it'll impact you as a developer in the coming years. And yes, we'll also build an agent from scratch in Python.</p>
<blockquote>
<p>💡 Are you new to Python? Learn Python with <a target="_blank" href="https://learnpython.com/?tap_a=131378-305e0d&amp;ref=ngm5zjv">LearnPython.com</a> courses. Fun interactive courses that are based on real-life business scenarios, meaning you’ll be writing Python code and seeing the results instantly. <em>No need to install Python or other tools on your device, everything happens through your favorite web browser</em> (Sponsored content).</p>
</blockquote>
<h2 id="heading-what-is-an-ai-agent-anyway">What is an AI agent anyway?</h2>
<p>This term is thrown around all the time as if it's something out of this world. No, it ain't! An agent is just a program; it just functions a tad differently.</p>
<p>Agents are just orchestration programs that can parse text (or other media formats), analyze the contextual meaning of that text using LLMs, and then execute tasks based on whatever requirement is set out in that text.</p>
<p>To clarify, take a spec for building an exchange rate calculator. The ticket might look like:</p>
<pre><code class="lang-plaintext">Design a program that will take in a dollar amount 
and generate the relevant Rand or 
Euro amount based on the current exchange rate.
</code></pre>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n1uywxyjkvlch19fan6z.jpg" alt="Currency converter" /></p>
<p>As a programmer, you will now build an interface with an input box. That input box will have some validation to ensure the end user only enters a valid number.</p>
<p>If they try something like this: "What is $500 in Euros?"</p>
<p>The validation will kick in and alert the user with a message: "Sorry, your input is invalid. Please only enter numbers"</p>
<p>As you can see, a conventional program is stricter: types are enforced and users must input data in a structured way. Agents, on the other hand, don't have this limitation; you can simply chat with the agent and it'll parse the natural language into something the computer can understand.</p>
<p>Instead of the programmer explicitly writing every single pathway a program can take, agents are more flexible in that they do have some level of decision-making, enabling them to dynamically build workflows on the fly.</p>
<p>In our currency conversion program, the programmer would have to build some sort of API integration. When the user clicks calculate, behind the scenes the numerical value entered is sent to a currency conversion API, and then the API responds with the relevant conversion.</p>
<p>The programmer needs to cater for every step in this workflow: take the numerical value and build some kind of JSON object, provide some form of authentication, make a POST request, get back the response data, parse it, and finally display the result.</p>
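<p>To make that concrete, here's a rough sketch of the hand-written version of this workflow. The endpoint, payload shape, and response field are hypothetical placeholders, and the HTTP call is injected as a function so each explicit step stays visible:</p>
<pre><code class="lang-python"># Sketch of the conventional workflow: build the payload, attach auth,
# make the request, then parse the response. Endpoint and field names
# are made up for illustration.
def convert(amount, target_currency, post):
    payload = {"amount": amount, "to": target_currency}  # build the JSON object
    headers = {"Authorization": "Bearer YOUR_API_KEY"}   # provide authentication
    # make the POST request (post is an injected HTTP helper)
    data = post("https://api.example.com/convert", payload, headers)
    return float(data["converted"])                      # parse the response
</code></pre>
<p>Every one of those steps is something the programmer had to decide on and write out explicitly; that's exactly the boilerplate an agent tries to skip.</p>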
<p>In the case of the agent, we don't need to explicitly write this workflow. The agent can build the JSON object all on its own and read auth credentials from your ENV, because it's able to leverage the coding abilities of modern LLMs.</p>
<p><strong>To give you another practical example:</strong></p>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/li9jbm98213hc10m992p.png" alt="Execution plan" /></p>
<p>In the above diagram, the agent first asks the LLM for a plan of action; next, it works through that plan step by step. At each step, it'll either go back to the LLM for additional information or execute some external task, like querying an API.</p>
<p>In the case of this diagram, the user asks the agent to fetch the price of a particular Amazon listing. The agent, in turn, queries the LLM for a list of tasks, which might look like this:</p>
<ul>
<li><p>Spin up a Python environment.</p>
</li>
<li><p>Write a script to visit the product page.</p>
</li>
<li><p>Wait for the DOM to load and scrape the HTML.</p>
</li>
<li><p>Use a regex or querySelector to find the DOM element with the price.</p>
</li>
<li><p>Extract the price and parse it into a float.</p>
</li>
<li><p>etc...</p>
</li>
</ul>
<p>When the agent runs the Python script, it may crash. This blocks the agent from executing the remaining steps, so it goes back and passes the errors to the LLM for further diagnosis. This ability to self-correct is both a blessing and a curse; more on that later...</p>
<p>The LLM then returns the fixed code, which the agent executes before continuing with its tasks.</p>
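<p>That self-correction loop can be sketched in a few lines. This is illustrative only: <code>ask_llm_to_fix</code> is a stand-in for a real LLM call, and a real agent would sandbox generated code rather than calling <code>exec</code> directly:</p>
<pre><code class="lang-python">def run_with_self_correction(code, ask_llm_to_fix, max_attempts=3):
    namespace = {}
    for _ in range(max_attempts):
        try:
            exec(code, namespace)           # run the generated script
            return namespace.get("result")  # success: hand back its output
        except Exception as err:
            # Feed the error back so the LLM can return corrected code.
            code = ask_llm_to_fix(code, str(err))
    # The "curse": without a hard cap, a hallucinating LLM loops forever.
    raise RuntimeError("agent gave up after repeated failures")
</code></pre>
<p>Note the hard attempt limit: that is the guardrail standing between "self-correcting" and "stuck in an endless loop".</p>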
<p>In summary: agents can generate code, query LLMs, and even talk to APIs. They are semi-autonomous programs that can function without explicit instructions for every single pathway. Still, as powerful as agents are, they need you! The programmer.</p>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j01r5mwj8vglqpx3dtjq.jpg" alt="Love and support" /></p>
<p>Similar to how you needed your parents to provide you with a support system, education, clothing, food, and so forth... Agents need programmers to build the "Lego blocks" of functionality and define the rules of the environment in which it runs.</p>
<p>By "Lego blocks" I mean adapters. While an LLM could generate code for an API integration, this is probably not a good idea, because a) it can hallucinate and just error out half the time, and b) there's no guarantee we can trust the source API.</p>
<p>A better approach would be to build an adapter for the API of your choosing and just provide the agent with some information on how to use that adapter (LLMs usually refer to these as tools).</p>
<h2 id="heading-agents-are-only-as-good-as-the-best-llm">Agents are only as good as the best LLM</h2>
<p>Take a set of twins: they are identical in most aspects, including height, body weight, schooling, and so on.</p>
<p>One drives a Toyota Corolla and the other a Ferrari. They both have similar driving skills, they swap cars from time to time and they love racing each other.</p>
<p>Regardless of which brother drives which car, the Ferrari always wins.</p>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/djeek2gp10gs6fzxtsyv.jpg" alt="Cars Racing" /></p>
<p>Why!!?</p>
<p>I won't go into the details, but clearly the Ferrari has a far superior engine, which is why it can beat the Corolla any day without much fuss.</p>
<p>Coming back to agents: agents do not have any reasoning capability of their own; they rely heavily on LLMs.</p>
<p>These LLMs have no measure of quality or completeness; they just generate tokens that mostly look correct, but they don't apply the same level of quality checking a human would, nor do they have the same level of understanding.</p>
<p>Even worse, as you keep prompting the model, it'll keep giving you some sort of response with confidence even if that response is just, well, plain garbage.</p>
<p>Throw in an agent and you've got big trouble! Because the agent's execution plan is built by the LLM, it's just going to loop through each step and execute. Sure, agents can detect problems and re-ask the LLM to fix issues, but since the LLM will most likely hallucinate at some point, the agent can get stuck in an endless loop, or more likely will fail to complete tasks holistically.</p>
<p>This is why agents need to be programmed with loads of guardrails. They are far from autonomous: they need constant fine-tuning and can only be used for very niche tasks.</p>
<h2 id="heading-programmers-rule-the-world">Programmers rule the world!</h2>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4ep2k1pnng14ehw5w9ps.jpg" alt="Programmers rule the world" /></p>
<p>Does Zuck really want to promote agents to mid-level software engineers? Can you imagine the agent getting stuck in an endless loop, constantly adding and deleting lines of code for 4 hours 😂 and then submitting a PR with broken code?</p>
<p>Then another agent merges that code, and 💣 half of Facebook is down for hours! Hey! It's possible, looking at the crappy code that the so-called best LLMs generate, I won't be surprised.</p>
<p>No! This is just a marketing gimmick. Yes, of course, a company like Facebook has some percentage of developers who perform very niche tasks.</p>
<p>These tasks can then be mostly automated with agents, with maybe one or two senior engineers supervising and merging code, allowing Facebook to let go of some percentage of its workforce. This is entirely possible, most likely is already happening, and will continue to happen.</p>
<p>For the rest of us, agents are going to bring on more work. Someone's going to have to build those APIs and write those guardrails to prevent agents from getting stuck in endless loops.</p>
<p>Furthermore, APIs and software are constantly changing, so someone needs to keep the agent code in sync with the rest of the software, and of course make those software changes in the first place.</p>
<p>Besides, I'm pretty sure salespeople are just going to sell more, because developers can now automate more tasks, freeing up their time to build more.</p>
<p>Finally, no tech CTO or founder is just going to hand over their AWS keys to an agent and fire all their developers. That's insane, AI is simply not at that level and probably won't reach that level anytime soon.</p>
<blockquote>
<p>PS: You should read my earlier article to learn more about "Tunnel Syndrome" <a target="_blank" href="https://kevincoder.co.za/tunnel-syndrome-ais-biggest-weakness">here</a>.</p>
</blockquote>
<h2 id="heading-ai-agents-are-the-future-just-like-the-mobile-phone">AI agents are the future just like the mobile phone</h2>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/w602ms8l2ovr5f0sm852.jpg" alt="Futuristic mobile phones" /></p>
<p>Remember the doomsayers who said computers were going to die off because of mobile phones?</p>
<p>We developers are the computers of this era.</p>
<p>You'll find that, while a certain group of computer users now opt for a tablet or a mobile phone instead of a laptop, the number of laptop and desktop users is still pretty huge.</p>
<p>This is because mobile phones are mostly communication and consumption devices; they serve a slightly different market, and now developers have to make everything responsive so that we can support more devices.</p>
<p>Similar to how we now have to build two layouts, one for mobile and one for desktop, or add a ton of media queries to make a website responsive, the workload for developers is just going to increase, because now every company is going to want to deploy some AI feature or agent.</p>
<p>Soon, agents will be the new chatbot that everybody needs, so we'll be building agents.</p>
<p>Thus agents are here to stay; they may replace some jobs, but ultimately they will create more jobs, because we're going to need more developers to build and maintain these agents.</p>
<h2 id="heading-lets-build-an-agent-from-scratch">Let's build an agent from scratch</h2>
<p>Okay, enough talk; let's roll up our sleeves and actually build an agent. Before we get started, let me stress that this is a very basic example for educational purposes.</p>
<p>In the real world, you would want to handle errors a lot better and probably break this up into multiple classes.</p>
<p>The agent:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel
<span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">import</span> time

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ExecutionPlan</span>(<span class="hljs-params">BaseModel</span>):</span>
    steps: list[str]

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AIAgent</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, debug=False</span>):</span>
        self.client = OpenAI()
        self.debug = debug


    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">landing_page</span>(<span class="hljs-params">self, **kwargs</span>):</span>
        prompt = kwargs[<span class="hljs-string">'prompt'</span>]
        goal = <span class="hljs-string">"build a modern landing page"</span>
        execution_plan = self.build_execution_plan(goal, prompt)
        messages = [
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"You must build a landing page for this user based on their requirement."</span>},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: prompt},
        ]

        <span class="hljs-keyword">for</span> step <span class="hljs-keyword">in</span> execution_plan.steps:
            <span class="hljs-keyword">if</span> self.debug:
                print(<span class="hljs-string">f"Executing step: <span class="hljs-subst">{step}</span>."</span>)
            messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: step})
            response = self.ask_model(messages)
            <span class="hljs-keyword">if</span> self.debug:
               print(<span class="hljs-string">f"Done: <span class="hljs-subst">{response}</span>"</span>)
            messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"content"</span>: response})

        messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Please respond with the final design for my landing page requirement. Please return only HTML and no extra commentary."</span>})

        landing_page = self.ask_model(messages)

        <span class="hljs-keyword">with</span> open(<span class="hljs-string">"landing.html"</span>, <span class="hljs-string">"w+"</span>) <span class="hljs-keyword">as</span> f:
            f.write(landing_page)

        <span class="hljs-keyword">return</span> <span class="hljs-string">"Done. Please view landing.html in your browser."</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">invoke_action</span>(<span class="hljs-params">self, action, **kwargs</span>):</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> hasattr(self, action):
            <span class="hljs-keyword">return</span> <span class="hljs-string">"Sorry, I have no idea what to do with this request!?"</span>

        <span class="hljs-keyword">return</span> getattr(self, action)(**kwargs)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">ask_model</span>(<span class="hljs-params">self, messages, model=<span class="hljs-string">"gpt-4o-mini"</span>, temperature=<span class="hljs-number">0.7</span>, response_format=None</span>):</span>
        i = <span class="hljs-number">0</span>
        <span class="hljs-keyword">while</span> i &lt; <span class="hljs-number">3</span>:
            i += <span class="hljs-number">1</span>
            <span class="hljs-keyword">try</span>:
                <span class="hljs-keyword">if</span> response_format <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>:
                    response = self.client.beta.chat.completions.parse(
                        model=model,
                        messages=messages,
                        temperature=temperature,
                        response_format=response_format
                    )
                    <span class="hljs-keyword">return</span> response.choices[<span class="hljs-number">0</span>].message.parsed
                <span class="hljs-keyword">else</span>:
                    response = self.client.chat.completions.create(
                        model=model,
                        messages=messages,
                        temperature=temperature
                    )
                    <span class="hljs-keyword">return</span> response.choices[<span class="hljs-number">0</span>].message.content

            <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> ex:
                print(ex)
                time.sleep(<span class="hljs-number">2</span>)
        <span class="hljs-keyword">raise</span> RuntimeError(<span class="hljs-string">"Model request failed after 3 attempts"</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">intent_router</span>(<span class="hljs-params">self, user_prompt</span>) -&gt; str:</span>
        prompt = <span class="hljs-string">"""
            You must analyze the following user prompt and determine which action below best describes
            the user's request. Respond only with the action e.g. [email]

            1) [email] - the user wishes to send an email.
            2) [landing_page] - the user wishes to build a landing page.
            3) [book_calendar_date] - the user wishes to book a slot in their calendar.
        """</span>

        intent = self.ask_model([
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: prompt},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: user_prompt}
        ])

        <span class="hljs-keyword">if</span> self.debug:
            print(<span class="hljs-string">f"RAW intent: <span class="hljs-subst">{intent}</span>"</span>)

        <span class="hljs-keyword">return</span> intent.replace(<span class="hljs-string">"["</span>, <span class="hljs-string">""</span>).replace(<span class="hljs-string">"]"</span>, <span class="hljs-string">""</span>).strip()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">build_execution_plan</span>(<span class="hljs-params">self, goal, prompt</span>):</span>
        system_prompt = <span class="hljs-string">f"""
            Given the current goal:'<span class="hljs-subst">{goal}</span>' and the user prompt, return a step-by-step execution plan
            for an Agent to work through, so that it can adequately and efficiently fulfill the users request.
        """</span>
        messages = [
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: system_prompt},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: prompt}
        ]

        <span class="hljs-keyword">return</span> self.ask_model(messages, response_format=ExecutionPlan)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">run</span>(<span class="hljs-params">self, prompt</span>):</span>
        intent = self.intent_router(prompt)
        <span class="hljs-keyword">if</span> self.debug:
            print(<span class="hljs-string">f"Intent for: '<span class="hljs-subst">{prompt}</span>' === <span class="hljs-subst">{intent}</span>"</span>)

        <span class="hljs-keyword">return</span> self.invoke_action(intent, prompt=prompt)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">main</span>():</span>
    agent = AIAgent(debug=<span class="hljs-literal">True</span>)
    result = agent.run(<span class="hljs-string">"I would like to build a landing page for my plumbing business. Please include images for pixels.com and generate copy that makes sense for industry."</span>)

    print(result)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    main()
</code></pre>
<p>Let's break this down step by step. The first thing you want to do in the agent process is determine what the user is actually trying to do. This is very similar to routing in a traditional web application, where you map a URL route to a controller.</p>
<p>In the case of our agent, we don't have URLs; instead, the LLM parses the text prompt and determines a keyword (i.e. the "intent") that best describes what the user wants to achieve:</p>
<pre><code class="lang-python">
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">intent_router</span>(<span class="hljs-params">self, user_prompt</span>) -&gt; str:</span>
        prompt = <span class="hljs-string">"""
            You must analyze the following user prompt and determine which action below best describes
            the user's request. Respond only with the action e.g. [email]

            1) [email] - the user wishes to send an email.
            2) [landing_page] - the user wishes to build a landing page.
            3) [book_calendar_date] - the user wishes to book a slot in their calendar.
        """</span>

        intent = self.ask_model([
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: prompt},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: user_prompt}
        ])

        <span class="hljs-keyword">if</span> self.debug:
            print(<span class="hljs-string">f"RAW intent: <span class="hljs-subst">{intent}</span>"</span>)

        <span class="hljs-keyword">return</span> intent.replace(<span class="hljs-string">"["</span>, <span class="hljs-string">""</span>).replace(<span class="hljs-string">"]"</span>, <span class="hljs-string">""</span>).strip()
</code></pre>
<p>Once we know the intent, we can easily determine what method/action to execute:</p>
<pre><code class="lang-python">    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">run</span>(<span class="hljs-params">self, prompt</span>):</span>
        intent = self.intent_router(prompt)
        <span class="hljs-keyword">if</span> self.debug:
            print(<span class="hljs-string">f"Intent for: '<span class="hljs-subst">{prompt}</span>' === <span class="hljs-subst">{intent}</span>"</span>)

        <span class="hljs-keyword">return</span> self.invoke_action(intent, prompt=prompt)
</code></pre>
<p>Our "run" method, which is the entry point that starts up the agent, uses the intent to dynamically execute a method on the "AIAgent" class.</p>
<p>Taking a peek inside "invoke_action", you'll see that it basically just checks whether the class has a method named exactly the same as the relevant intent.</p>
<pre><code class="lang-python">    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">invoke_action</span>(<span class="hljs-params">self, action, **kwargs</span>):</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> hasattr(self, action):
            <span class="hljs-keyword">return</span> <span class="hljs-string">"Sorry,...."</span>

        <span class="hljs-keyword">return</span> getattr(self, action)(**kwargs)
</code></pre>
<p>If the method exists, we just execute it, passing any extra arguments through via "kwargs".</p>
<p>In my example, I'm asking the agent to build a landing page, so the intent resolves to "landing_page", and if you look inside the "AIAgent" class you'll notice a method with this same name:</p>
<pre><code class="lang-python">    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">landing_page</span>(<span class="hljs-params">self, **kwargs</span>):</span>
        prompt = kwargs[<span class="hljs-string">'prompt'</span>]
        goal = <span class="hljs-string">"build a modern landing page"</span>
        execution_plan = self.build_execution_plan(goal, prompt)
        messages = [
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"You must build a landing page for this user based on their requirement."</span>},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: prompt},
        ]

        <span class="hljs-keyword">for</span> step <span class="hljs-keyword">in</span> execution_plan.steps:
            <span class="hljs-keyword">if</span> self.debug:
                print(<span class="hljs-string">f"Executing step: <span class="hljs-subst">{step}</span>."</span>)
            messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: step})
            response = self.ask_model(messages)
            <span class="hljs-keyword">if</span> self.debug:
               print(<span class="hljs-string">f"Done: <span class="hljs-subst">{response}</span>"</span>)
            messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"assistant"</span>, <span class="hljs-string">"content"</span>: response})

        messages.append({<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Please respond with the final design for my landing page requirement. Please return only HTML and no extra commentary."</span>})

        landing_page = self.ask_model(messages)

        <span class="hljs-keyword">with</span> open(<span class="hljs-string">"landing.html"</span>, <span class="hljs-string">"w+"</span>) <span class="hljs-keyword">as</span> f:
            f.write(landing_page)

        <span class="hljs-keyword">return</span> <span class="hljs-string">"Done. Please view landing.html in your browser."</span>
</code></pre>
<p>The method starts off by defining a "goal". This is more descriptive than the single-keyword intent and is used to prompt the model again for a detailed step-by-step workflow.</p>
<pre><code class="lang-python">execution_plan = self.build_execution_plan(goal, prompt)
</code></pre>
<p>Looking inside the execution-plan method, we use structured outputs to tell the LLM to return a list of steps in the shape of "ExecutionPlan", which is just a plain Pydantic model:</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ExecutionPlan</span>(<span class="hljs-params">BaseModel</span>):</span>
    steps: list[str]
...
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">build_execution_plan</span>(<span class="hljs-params">self, goal, prompt</span>):</span>
        system_prompt = <span class="hljs-string">f"""
            Given the current goal:'<span class="hljs-subst">{goal}</span>' and the user prompt, return a step-by-step execution plan
            for an Agent to work through, so that it can adequately and efficiently fulfill the users request.
        """</span>
        messages = [
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: system_prompt},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: prompt}
        ]

        <span class="hljs-keyword">return</span> self.ask_model(messages, response_format=ExecutionPlan)

</code></pre>
<p>Finally, we just loop through all the steps and prompt the model one by one until the final landing page is built.</p>
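The step loop itself isn't shown above, so here is a minimal, runnable sketch of what it might look like. Everything here is illustrative: execute_plan and fake_model are my own names, and a real implementation would call the ask_model method from the snippets above with the accumulated conversation:

```python
# Illustrative sketch only: execute_plan and fake_model are made-up names.
# A real implementation would call ask_model (from the snippets above)
# and carry the accumulated conversation between steps.
def execute_plan(steps, ask_model):
    page_so_far = ""
    for i, step in enumerate(steps, start=1):
        messages = [
            {"role": "system", "content": f"You are working on step {i}: {step}"},
            {"role": "user", "content": f"Current page so far:\n{page_so_far}"},
        ]
        page_so_far = ask_model(messages)  # each step refines the page
    return page_so_far

# Stub model for demonstration: echoes the system instruction it was given.
def fake_model(messages):
    return messages[0]["content"]

result = execute_plan(["Draft the hero section", "Add a footer"], fake_model)
```

Feeding the running output back in at each step is what lets later steps build on earlier ones instead of starting from scratch.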
<h2 id="heading-conclusion-ai-agents-are-here-to-stay">Conclusion: AI Agents are here to stay</h2>
<p>Don't be fooled by these big-name CEOs and tech entrepreneurs who just spread propaganda so that they can drive more investment into their companies. They use AI agents as a means to stir up market interest and get non-technical people all excited.</p>
<p>This is an age-old marketing trick! These people are obsessed with the bottom line, a sad side effect of capitalism! Greed for growth at any cost is the endgame.</p>
<p>The reality is that the technology, while useful, is over-hyped. It's been a good few years since the first version of ChatGPT was released, and yet AI products still fall into just two categories: AI wrappers or value-adds.</p>
<p>At the end of the day, Agents, AI, and all these tools are just that, tools. They are here to stay and plenty useful, but do not have reasoning and thinking capabilities on the same level as that of a human being, and probably never will!</p>
]]></content:encoded></item><item><title><![CDATA[Tunnel Syndrome: AI's biggest weakness?]]></title><description><![CDATA[As a developer, are you worried that AI is going to replace you? This is a valid concern, but let me put your mind at ease, chances are this is highly unlikely in 2025, or at any point in the near future.
Why am I so confident?!
AI Tunnel Syndrome is...]]></description><link>https://kevincoder.co.za/tunnel-syndrome-ais-biggest-weakness</link><guid isPermaLink="true">https://kevincoder.co.za/tunnel-syndrome-ais-biggest-weakness</guid><category><![CDATA[AI]]></category><category><![CDATA[LLM's ]]></category><category><![CDATA[discussion]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Mon, 13 Jan 2025 09:56:52 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736762347352/1a0fe1e7-e7a6-4d6a-ab17-17dd18426379.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As a developer, are you worried that AI is going to replace you? This is a valid concern, but let me put your mind at ease: chances are this is highly unlikely in 2025, or at any point in the near future.</p>
<p>Why am I so confident?!</p>
<p>AI Tunnel Syndrome is one of AI's biggest weaknesses and it's not going to get better any time soon, regardless of "o3" or whatever model comes out next.</p>
<h2 id="heading-preamble-the-consciousness">Preamble: The Consciousness</h2>
<p>Okay, so "AI Tunnel Syndrome" is a term I just coined while thinking about AI stuff, but bear with me for now, we'll get back to what I mean in a minute.</p>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1rthemtbomgo2ozhtyul.jpg" alt="AI Awakening" /></p>
<p>The thing is, to be human, you are more than just a library of knowledge. You are filled with memories, emotions, and experiences. You are a being that can adapt to just about any environment and thrive even in the most impossible circumstances.</p>
<p>You have a worldview and consciousness.</p>
<p>AI, on the other hand, is frozen in time. You see, there is a reason why we call them "models". Each new iteration has a version number. Do you have a version number? No, because your knowledge and experience grow every second of every day.</p>
<p>Models have no consciousness, they are just algorithms that can parse, scan, and find patterns in a large corpus of information.</p>
<p>The version indicates that the model's cutoff knowledge base has just moved further down the line with more data and various improvements on both the dataset and the algorithm.</p>
<p>The LLM stops learning at this version cutoff; every conversation thereafter is just an interaction, and there's no impact on the model's persistent knowledge base (although you can supplement the AI’s knowledge base with RAG and tool calls).</p>
<h2 id="heading-the-world-view">The World View</h2>
<p>As human beings, we understand not only the knowledge we accumulate over time but also how that knowledge fits into the broader context of the world around us.</p>
<p>While we are not soothsayers (well, most of us anyway), we do have the ability to see into the future to a certain extent and make decisions based on all three temporal dimensions: the past, the present, and the future.</p>
<p>To illustrate what I mean, here's an example:</p>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wzc749w4nj3l7smv8mg2.jpg" alt="AI eating ice cream" /></p>
<p>Imagine it's a bright sunny day, incredibly hot, and you are sweating like crazy. At this moment, you have all the knowledge of everything you've ever learned. You have memories and past experiences too. Sure, you might forget stuff, but overall, you remember most things.</p>
<p>Your mind floats to thoughts of a nice cold ice cream, and you remember how refreshing and cooling it feels to have one from past experiences.</p>
<p>This now drives your decision to buy an ice cream! But wait! You bought an ice cream from the guy down the street 3 months ago, and it wasn't all that great.</p>
<p>You also know, it's fast approaching lunchtime, and at 12 PM, the city center gets really busy. On such a hot and sweaty day, the last thing you want to do is get stuck in a crowd of people.</p>
<p>So eventually, you decide to take a quiet side street instead, where you can buy your ice cream away from the city's hustle and bustle.</p>
<p>While enjoying that ice cream, you look to the city center and smile 😊, it's lunchtime and everyone is flocking there! Phew, dodged a bullet, didn't you?</p>
<p>Wait, what does this have to do with AI!?</p>
<p>Patience 🙏 eager-beaver, I'm getting there 🙃</p>
<p>In the above example, your brain is using senses, memories, past experiences, and emotions altogether, all at once to build this complex decision tree in record time, whilst still consuming very little energy.</p>
<p>Furthermore, you are even using that information to predict the future and other possible edge cases and consequences.</p>
<p>Models, on the other hand, have no worldview and are narrow-focused; they take your prompt and do some math on it to determine patterns and complex relationships between words.</p>
<p>Thereafter, based on the data they've been trained on, they generate a response that's most likely to satisfy the prompt. The quality of your prompt directly impacts this calculation and thus the final result.</p>
<p>This is perfectly fine for a simple "lookup" question like: "Explain the theorem of Pythagoras". The question has one clear goal, thus the model can easily determine the meaning and generate an accurate response.</p>
<p>Give it a prompt like "Build me a landing page for an electrician, the colors are red and black. For the content, use placeholder images and text. I need 5 pages: About, services, contact, gallery..."</p>
<p>If you ask a human to do this task, even with this poor specification, they probably will come back with a complete website. AI on the other hand, even powerful models like Sonnet will build you something, but it's never going to be polished and will most likely miss obvious features.</p>
<h2 id="heading-what-is-ai-tunnel-syndrome">What is AI tunnel syndrome?</h2>
<p>With your worldview in the case of the ice cream example, you were able to piece together a complete picture of the environment, the taste, the texture, the horror of overcrowding, and the pleasant memory of eating the ice cream.</p>
<p>Coupled with the fact that you are conscious, every event in your life no matter how small or big it is, contributes to your worldview, your memories, and your learning.</p>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yf752gt6lcznjla6jj5c.png" alt="tunnel vision" /></p>
<p>To drive home the true meaning of "AI Tunnel Syndrome", let's look at another practical example. In the real world, just writing code is not enough; you need to sit in meetings with non-technical people, who often don't understand their actual requirements, budget constraints, and the various factors that come into play when building out their application.</p>
<p>Can you imagine a request: "We need something like Twitter, so we can all communicate internally with each other, share files, and even video call. Please, can you build this for us?"</p>
<p>🤖: Sure, has access to the codebase, goes off, generates some Next.js code. Mostly works, but there is no auth integration, so just about anyone can initiate a chat. Also, there's no mime validation on files, so one can upload just about anything, and finally, the UI has a logo of the old Twitter bird icon?!</p>
<p>👨‍💻: I ain't got 3 months to build that, we'll just set up Slack.</p>
<p>The human immediately identifies information beyond the prompt, the context, and the consequences of executing the task concerning time and budget constraints.</p>
<p>The AI, on the other hand, finds the relevant information and starts building but doesn't cater to many edge cases or even think about just using a pre-built solution like Slack.</p>
<p>💡The lack of this complete "big picture" view (which is second nature to human beings) is what I call "AI Tunnel Syndrome".</p>
<h2 id="heading-enter-the-agent">Enter the Agent</h2>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/azbh0yn94lh9qkqg7rfu.png" alt="AI Agents" /></p>
<p>Agents are an interesting next evolution of AI models. Think of an agent as an orchestra conductor. In an orchestra, you have several different instruments: violins, clarinets, trumpets, and so forth. Each instrument produces a different sound.</p>
<p>Can you imagine if each musician played from a different music sheet, the result would be chaotic! Instead of harmony, you'd hear a mix of competing melodies, rhythms, and keys that don't complement each other. Each of these instruments alone sounds great, but as soon as you put them together, you need some sort of structure.</p>
<p>The conductor ensures that they all stay in sync and play harmoniously, producing beautiful music.</p>
<p>Bringing it back to Agents. LLMs simply take input (whether it’s text, voice, images, etc.) and respond with some output. They can look at any context data you supply with the prompt, like a PDF or previous chat messages, but beyond that, they cannot access information outside of the model.</p>
<p>To solve this problem, AI companies developed “tools”. With tools, you can connect the LLM to an external service like an API, search engine, emails, calendars, or any other data source. A tool is simply a function in your code that the LLM can trigger: the LLM can pass parameters to the function, and the function responds with some kind of output that the LLM uses to finish its response back to the user.</p>
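The tool mechanism described above essentially boils down to a dispatch table. Here is a runnable sketch, with made-up tool names, of how a runtime might route a model's tool call to a plain Python function:

```python
import json

# Illustrative sketch: the tool name and registry below are made up.
def get_weather(city: str) -> str:
    # A real tool would call an external weather API here.
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def handle_tool_call(tool_call_json: str) -> str:
    """The model emits a tool call (a name plus arguments); the runtime runs
    the matching function and hands the output back so the model can finish
    its response to the user."""
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]](**call["arguments"])

output = handle_tool_call('{"name": "get_weather", "arguments": {"city": "Durban"}}')
```

Note how each tool is an isolated function: on its own it has no knowledge of the other tools, which is exactly the "individual instruments" limitation described above.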
<p>This is when we start getting into Agent territory, a tool on its own is still an input/output mechanism. You can most certainly run any kind of code in your function, but tools generally do not know of each other and can’t communicate with each other. So they are essentially like individual musical instruments. On their own, each tool is plenty useful, but together they have no way of sharing information or communicating.</p>
<p>Agents are programs that can orchestrate tools, LLMs, and even elements outside of the LLM’s scope, like creating files in a file system.</p>
<p>When you prompt an agent, it drafts a step-by-step plan, a sort of checklist that it must follow to achieve whatever outcome you asked it to achieve. At each of these checklist steps, the agent will ask the LLM for whatever information it needs, then proceed to call tools or other external functions until it can check that task off its checklist.</p>
<p>The agent will then usually verify each step, and fix any issues that occur along the way until it checks off all items in the checklist.</p>
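That plan/verify loop can be sketched in a few lines of Python. This is illustrative only: the attempt and verify callables stand in for real LLM and tool calls, and all the names are mine:

```python
# Illustrative sketch of the checklist loop: attempt and verify stand in
# for real LLM/tool calls.
def run_agent(checklist, attempt, verify, max_retries=3):
    completed = []
    for task in checklist:
        for _ in range(max_retries):
            result = attempt(task)
            if verify(task, result):  # re-check the step before ticking it off
                completed.append(task)
                break
        else:
            # Exhausted retries without a verified result.
            raise RuntimeError(f"Could not complete step: {task}")
    return completed

# Toy callables: every attempt succeeds and verification passes.
done = run_agent(
    ["scaffold project", "write landing page"],
    attempt=lambda task: f"did {task}",
    verify=lambda task, result: result.endswith(task),
)
```

The verify-and-retry inner loop is what distinguishes an agent from a single prompt: failed steps get another attempt instead of silently propagating.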
<p>Agents are therefore a whole new level of automation, and you can use them to perform all kinds of tasks like building landing pages, scaffolding a coding project, web scraping, and building reports, etc.</p>
<p>While agents are powerful, they are still limited by the intelligence of the LLMs; thus, they still cannot think at the same level as a human and will make loads of mistakes.</p>
<p>I covered agents in more detail <a target="_blank" href="https://kevincoder.co.za/demystifying-the-dark-arts-of-ai-agents">here</a> if you want to learn more.</p>
]]></content:encoded></item><item><title><![CDATA[PostgreSQL for a rusty MySQL brain]]></title><description><![CDATA[I come from a PHP background, hence why I prefer MySQL but when you dealing with Django and other Python projects, PostgreSQL seems to be the most popular and preferred DB server.
I have used PostgreSQL now and then throughout my career, it just feel...]]></description><link>https://kevincoder.co.za/postgresql-for-a-rusty-mysql-brain</link><guid isPermaLink="true">https://kevincoder.co.za/postgresql-for-a-rusty-mysql-brain</guid><category><![CDATA[postgresql cheat sheet]]></category><category><![CDATA[PostgreSQL]]></category><category><![CDATA[Postgres extensions]]></category><category><![CDATA[pgvector]]></category><category><![CDATA[timescaledb]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Thu, 12 Dec 2024 15:51:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1734018572336/4e3a58a4-61ca-4fbb-aa81-3a20699f56f1.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I come from a PHP background, hence why I prefer MySQL, but when you're dealing with Django and other Python projects, PostgreSQL seems to be the most popular and preferred DB server.</p>
<p>I have used PostgreSQL now and then throughout my career; it just feels weird to me with all the “\” commands. MySQL syntax is more English-like: “show tables; describe table …”, so it’s easy to remember.</p>
<p>Nonetheless, PostgreSQL is a powerful open-source DB server that’s more than capable of handling any project size; this guide is more of a “cheat sheet” for me (and hopefully you too!).</p>
<blockquote>
<p>⚠️ This article assumes you are using Linux, more specifically Ubuntu. Debian might also work, as well as other Linux server types with a few minor tweaks.</p>
</blockquote>
<h2 id="heading-translating-mysql-commands-into-postgresql">Translating MySQL commands into PostgreSQL</h2>
<p><strong>Connecting to a database:</strong></p>
<pre><code class="lang-sql">\c db_name; <span class="hljs-comment"># Postgres</span>
<span class="hljs-keyword">use</span> db_name; <span class="hljs-comment"># MySQL</span>
</code></pre>
<p><strong>List databases:</strong></p>
<pre><code class="lang-sql">\list <span class="hljs-comment"># Postgres</span>
<span class="hljs-keyword">show</span> <span class="hljs-keyword">databases</span>; <span class="hljs-comment"># MySQL</span>
</code></pre>
<p><strong>Show tables:</strong></p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> tablename <span class="hljs-keyword">FROM</span> pg_catalog.pg_tables;  <span class="hljs-comment"># Postgres</span>
<span class="hljs-keyword">SHOW</span> <span class="hljs-keyword">tables</span>; <span class="hljs-comment"># MySQL</span>
</code></pre>
<p>It’s worth noting that Postgres will show you tables from every schema in the current database, including the system catalogs (you can filter with a WHERE clause on “schemaname”, or just use “\dt”), whereas MySQL will only show you tables in the current database.</p>
<p>Also, not to be a PostgreSQL hater, but psql pipes long output through a pager, so you’ll see a colon and have to keep pressing enter; MySQL, on the other hand, just lists everything.</p>
<p><strong>Describe a table:</strong></p>
<pre><code class="lang-sql">\d tablename; <span class="hljs-comment"># Postgres</span>
<span class="hljs-keyword">describe</span> tablename; <span class="hljs-comment"># MySQL</span>
</code></pre>
<p><strong>Using the CLI:</strong></p>
<pre><code class="lang-sql">mysql -usomeuser -p1234 somedb
psql -U postgres -W somedb
</code></pre>
<p>With PostgreSQL, there is a user called “postgres”, so you can also run the following to access the DB with no password prompt:</p>
<pre><code class="lang-sql">sudo su - postgres
psql somedb

<span class="hljs-comment"># OR</span>
sudo -u postgres psql somedb
</code></pre>
<h2 id="heading-where-art-thou-postgresql">Where art thou PostgreSQL?</h2>
<p>The configs are located in /etc/ on Linux, which is very similar to MySQL. There are two main config files to pay attention to:</p>
<p><strong>Client authentication settings</strong></p>
<p>This config controls who has access to your server:</p>
<p><em>/etc/postgresql/16/main/pg_hba.conf</em></p>
<pre><code class="lang-sql"><span class="hljs-comment"># "local" is for Unix domain socket connections only</span>
local   all             all                                     peer
host    all             all             192.168.0.1/32          scram-sha-256
host    all             all             10.0.0.1/32             scram-sha-256
</code></pre>
<p>PostgreSQL works a tad bit differently than MySQL. In MySQL, you just create a user with an IP address and grant them access to whatever database you want, e.g.:</p>
<pre><code class="lang-sql"><span class="hljs-comment"># MySQL grant remote user access</span>
<span class="hljs-keyword">create</span> <span class="hljs-keyword">user</span> <span class="hljs-string">'me'</span>@<span class="hljs-string">'192.168.0.1'</span> identified <span class="hljs-keyword">by</span> <span class="hljs-string">'1234'</span>;
</code></pre>
<p>In PostgreSQL on the other hand, you must create a user without the IP and then just grant them access by listing their IP in the config file mentioned above.</p>
<blockquote>
<p>⚠️ For both DB’s you still need to bind the server on 0.0.0.0 or some network IP otherwise they won’t allow external connections.</p>
</blockquote>
<p><strong>Main PostgreSQL config:</strong></p>
<p><em>/etc/postgresql/16/main/postgresql.conf</em></p>
<p>This is the main config file where you can set the bind address, data-directory path, and various other settings such as memory settings, worker configs, row security, logging, etc…</p>
<p>While you can change this in the main config, the default PostgreSQL data directory is located in:</p>
<p><em>/var/lib/postgresql/</em></p>
<h2 id="heading-backups">Backups</h2>
<p>On MySQL, Percona Toolkit is my go-to toolset. You can back up and archive data, and do loads of other complex management tasks, while the DB server is online and serving requests.</p>
<p>Seems like there is a version for PostgreSQL as well; I briefly read through it, but I haven’t used PostgreSQL comprehensively enough on a large DB to warrant the use of Percona.</p>
<p>For simple backups, you can run the following (similar to mysqldump):</p>
<pre><code class="lang-sql">pg_dump -U postgres -d dbname -f backup.sql
</code></pre>
<p>This will create a SQL dump file. I tend to avoid SQL dump files like the plague (well, at least with MySQL anyway), because when you restore on a new server, subtle differences between versions can cause the restore to fail.</p>
<p>Furthermore, each statement has to be evaluated and run so if you have a 500GB db for example, this will be slow even if you have fast disks.</p>
<p>A better option is to use a filesystem backup; these are much faster and less prone to breaking when restoring, since you’re copying an entire PostgreSQL instance’s data directory:</p>
<pre><code class="lang-sql">pg_basebackup -D <span class="hljs-keyword">backup</span> -Fp
</code></pre>
<ul>
<li><p>-D: the directory to store the backup in.</p>
</li>
<li><p>-F: Format, can be plain or “tar”. Usually “p” for plain is faster but takes up more disk space. Since the “t” for tar option creates a tarball archive, this can be slow depending on how good your disk IO is.</p>
</li>
</ul>
<p>You can then use the following to restore:</p>
<pre><code class="lang-sql">rsync -av <span class="hljs-comment">--progress backup/ /var/lib/postgresql/16/main/</span>
chmod 700 /var/lib/postgresql/16/main
chown -R postgres:postgres /var/lib/postgresql/16/main
</code></pre>
<h2 id="heading-timescale">Timescale</h2>
<p>The great thing about PostgreSQL is that it allows you to customize the DB via extensions. This opens up a world of possibilities, allowing the community to extend the core feature set in whatever way is needed.</p>
<p>One such impressive extension is Timescale; as the name suggests, Timescale allows you to store and query large time-series datasets. When querying large tables in an RDBMS (hundreds of millions of rows), you will eventually notice queries becoming incredibly slow, even with indexes and optimizations. You then would need to resort to more expensive hardware, sharding, and replication to get around these issues.</p>
<p>Timescale on the other hand can query and write to such tables incredibly fast, and you probably won’t even need to shard or set up replication (and be able to run on cheaper hardware too).</p>
<p>To install Timescale on Ubuntu for an existing PostgreSQL server, use the following script (or view the full step-by-step instructions on Timescale’s website <a target="_blank" href="https://docs.timescale.com/self-hosted/latest/install/installation-linux/">here</a>):</p>
<pre><code class="lang-bash"><span class="hljs-built_in">echo</span> <span class="hljs-string">"deb https://packagecloud.io/timescale/timescaledb/ubuntu/ <span class="hljs-subst">$(lsb_release -c -s)</span> main"</span> | sudo tee /etc/apt/sources.list.d/timescaledb.list
wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/timescaledb.gpg

sudo apt update
sudo apt install timescaledb-2-postgresql-17 postgresql-client-17
sudo timescaledb-tune


sudo systemctl restart postgresql


sudo -u postgres psql
CREATE EXTENSION IF NOT EXISTS timescaledb;
</code></pre>
<p>Once you have Timescale installed, you can set up a “hypertable” as follows:</p>
<pre><code class="lang-bash">CREATE TABLE meter_readings (
   time        TIMESTAMPTZ       NOT NULL,
   warehouse   TEXT              NOT NULL,
   unit_code   TEXT              NOT NULL,
   reading DOUBLE PRECISION  NULL
);


SELECT create_hypertable(<span class="hljs-string">'meter_readings'</span>, by_range(<span class="hljs-string">'time'</span>));
</code></pre>
<p>So basically, 95% is the same as standard SQL, except for the “create_hypertable” function, which sets up all the Timescale optimizations on your table.</p>
<h2 id="heading-pgvector">PGVector</h2>
<p>Another great extension is “PGVector”; if you are building a RAG-powered chatbot or need to perform similarity searches, you will need to vectorize your data and store those vector embeddings somewhere. Traditional databases like PostgreSQL and MySQL are not designed for vector embeddings.</p>
<p>You will need to reach for a DB like Qdrant, which is built for that kind of storage.</p>
<p>Except! One problem: your project runs on PostgreSQL, and it’s such a pain to introduce a new service into your stack for a dataset that’s not going to exceed maybe 10-20k documents.</p>
<p>In such a case, PGVector, while not as fast as some of these vector-optimized storage solutions, is more than sufficient for your use case.</p>
<p>Since it’s just a PostgreSQL extension, installing is a breeze:</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> --branch v0.8.0 https://github.com/pgvector/pgvector.git
<span class="hljs-built_in">cd</span> pgvector
make
make install
</code></pre>
<p>Next, in the psql shell (CREATE EXTENSION only needs to be run once):</p>
<pre><code class="lang-sql"><span class="hljs-keyword">CREATE</span> EXTENSION vector;

<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> documents (
   <span class="hljs-keyword">id</span> bigserial PRIMARY <span class="hljs-keyword">KEY</span>,
   <span class="hljs-keyword">name</span> <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">100</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span>,
   embedding vector(<span class="hljs-number">1536</span>)
);
</code></pre>
<p>I used “1536” because the vector column must match the dimensions of your embeddings, and 1536 is the output size of OpenAI’s small text embedding model.</p>
<p>You can then insert and query data as follows:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> documents (embedding) <span class="hljs-keyword">VALUES</span> (<span class="hljs-string">'[1,2,3...]'</span>), (<span class="hljs-string">'[4,5,6....]'</span>);

<span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">FROM</span> items <span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> embedding &lt;=&gt; <span class="hljs-string">'[3,1,2...]'</span> <span class="hljs-keyword">LIMIT</span> <span class="hljs-number">5</span>;
</code></pre>
<p>The above “select” will perform a cosine similarity search and return the top 5 results. Bit of a schlep to convert those floating point numbers into the PostgreSQL format, but still far easier than maintaining another vector DB store.</p>
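To save you the schlep, here is a small Python helper (the names are mine) that formats a list of floats as a pgvector literal, plus a plain-Python version of the cosine distance that the ORDER BY above is computing:

```python
import math

# Helper names are mine; pgvector accepts vectors as a bracketed text
# literal such as '[1.0,2.0,3.0]'.
def to_pgvector(vec):
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

def cosine_distance(a, b):
    # Roughly what pgvector's cosine distance operator computes:
    # 1 - cosine similarity. Smaller means more similar, which is why
    # the query orders ascending and takes the first 5 rows.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

literal = to_pgvector([1, 2, 3])
```

You would pass the formatted literal as a query parameter rather than string-concatenating it into SQL.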
]]></content:encoded></item><item><title><![CDATA[Claude Sonnet is remarkably good]]></title><description><![CDATA[I know there is a lot of hype around AI, but when you look past the hype and just focus on real-life practical use cases, AI is pretty cool! As usual, I am excited and embracing all the change. Being in the industry for over 15 years, I have seen man...]]></description><link>https://kevincoder.co.za/claude-sonnet-is-remarkably-good</link><guid isPermaLink="true">https://kevincoder.co.za/claude-sonnet-is-remarkably-good</guid><category><![CDATA[claude.ai]]></category><category><![CDATA[AI]]></category><category><![CDATA[Web Design]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Sat, 07 Dec 2024 14:20:50 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1733580306953/30cadc11-7359-4999-a25e-22cc24bbabd2.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I know there is a lot of hype around AI, but when you look past the hype and just focus on real-life practical use cases, AI is pretty cool! As usual, I am excited and embracing all the change. Being in the industry for over 15 years, I have seen many hype circles come and go and have always enjoyed riding the wave.</p>
<p>Naturally, I have been test-driving Claude Sonnet for a while, and overall I must say the model is really impressive, especially for coding tasks and even creative tasks like web design.</p>
<p>Let’s see why I think Sonnet is one of the best, if not the best of the bigger models out in the wild right now.</p>
<h2 id="heading-claude-sonnet-is-good-at-long-context-prompts">Claude Sonnet is good at long-context prompts</h2>
<p>With Sonnet, you get a decent 200k context window. This basically means you do not need to fine-tune the model. In the case of HTML templates, if you ask the model to generate one, it’s going to be hit-and-miss: sometimes you’ll get great-looking templates, and other times just plain junk!</p>
<p>To get around this you can use RAG, but for this kind of task, generating HTML templates, a full RAG solution might be overkill; instead, you can just “shove” HTML examples into your prompt, similar to the example below:</p>
<pre><code class="lang-bash">&lt;example&gt;
...
 &lt;nav class=<span class="hljs-string">"navbar"</span>&gt;
        &lt;ul&gt;
            &lt;li&gt;&lt;a href=<span class="hljs-string">"#home"</span>&gt;Home&lt;/a&gt;&lt;/li&gt;
            &lt;li&gt;&lt;a href=<span class="hljs-string">"#about"</span>&gt;About&lt;/a&gt;&lt;/li&gt;
            &lt;li&gt;&lt;a href=<span class="hljs-string">"#services"</span>&gt;Services&lt;/a&gt;&lt;/li&gt;
            &lt;li&gt;&lt;a href=<span class="hljs-string">"#contact"</span>&gt;Contact&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
    &lt;/nav&gt;
...
&lt;/example&gt;
</code></pre>
<p>Shoving HTML or just about any example like this will help you tame the model and get it to scope its responses to just your context.</p>
<p>Sonnet also supports prompt caching. Therefore, even though you’re sending this large chunk of HTML every time you prompt the model, the model does not need to parse and analyze it every time; instead, it’ll automatically cache this data, which makes the whole process much more efficient and saves you money at the same time, since cached tokens are generally cheaper than regular tokens.</p>
<p>Furthermore, if you use their batch API instead of the real-time API, you save 50%. That’s remarkably cheap for a powerhouse model like Sonnet.</p>
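As an illustration of how the caching fits together, here is a rough sketch of a Messages API request body that marks the big reusable HTML example block as cacheable. The field layout follows Anthropic's documented cache_control format as I understand it; treat the exact field names and model alias as assumptions and double-check the current API reference:

```python
# Sketch of a request body with prompt caching. The cache_control layout
# follows Anthropic's documented format as I understand it; verify the
# field names and model alias against the current API reference.
HTML_EXAMPLES = "...large reusable HTML example snippets go here..."

def build_request(user_prompt: str) -> dict:
    return {
        "model": "claude-3-5-sonnet-latest",  # model alias is an assumption
        "max_tokens": 4096,
        "system": [
            {
                "type": "text",
                "text": HTML_EXAMPLES,
                # Mark the big static block as cacheable so repeat prompts
                # don't pay full price to re-process it.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_prompt}],
    }

req = build_request("Generate a landing page using the examples above.")
```

The key idea is that the large, unchanging prefix (the examples) sits in the system block, while only the small user prompt varies between requests.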
<h2 id="heading-claude-sonnet-is-really-good-at-backend-code">Claude Sonnet is really good at backend code</h2>
<p>I am a bit of an old-schooler and don’t use too many AI coding assistants; I feel they generate bad code, and I love fine-grained control over the code I write.</p>
<p>Nonetheless, in today’s fast-paced dev world, we have customers to service. Claude Sonnet is proving to be that excellent sidekick junior developer I’ve always needed; not that I am advocating for replacing developers with AI! No, never!</p>
<p>Instead, AI makes developers more productive so that we can focus on the hardcore stuff: the engineering and system design. The mundane stuff that took up time before, like fixing some CSS issue, is frankly a waste of time for a backend dev.</p>
<p>Even in the engineering process, there are often times you would “float” (for lack of a better word) between concepts, which means you have to scaffold 2-3 variations; that could take a couple of hours or days.</p>
<p>With Sonnet, you can just prompt the model to scaffold pseudo-code versions and then iterate from there. It’s boilerplates on steroids essentially!</p>
<p>Previously, with ChatGPT, you could get the model to do something similar; however, it was often a waste of time.</p>
<p>The code generated was either heavily hallucinated or outdated; you just couldn’t get consistent generations from those models, and you wasted too much time fine-tuning the prompt.</p>
<p>Sonnet seems to have a decent level of accuracy; it’s not always perfect, but it often gets you much closer than any of the OpenAI models ever could.</p>
<h2 id="heading-claude-sonnet-is-not-bad-at-web-design">Claude Sonnet is not bad at web design</h2>
<p>I can do UI development just fine; I know my way around pure CSS, Bootstrap, and even Tailwind. Yet building frontends is not really my cup of tea; I am a backend’er and prefer it that way.</p>
<p>Still, from time to time I have no choice but to throw together a simple landing page or tweak some components here and there.</p>
<p>Previously, I had two options: write the code myself or just buy a template. Now with Sonnet and a really well-structured prompt, I can get at least 80% there in just a few minutes, and then fine-tune the design manually as and where needed.</p>
<h2 id="heading-a-complete-landing-page-from-a-simple-prompt">A complete landing page from a simple prompt</h2>
<p>It’s amazing how much time you can save with Sonnet; I have done a few experiments with generating layouts and UI components, and for the most part, Sonnet does a pretty good job.</p>
<p>When the generated code isn’t up to standard, you can step-by-step ask Sonnet to tweak certain aspects, similar to sitting with a web designer and giving them feedback.</p>
<p>Here’s an example of a full layout designed 100% by Sonnet:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1731688381581/eddbd199-b0e3-4d9a-9a15-af9981efbf3a.png" alt class="image--center mx-auto" /></p>
<p>Not bad for a business-profile-type website, right? Sure, there are some rough edges and some styling fine-tuning one can do to make it a little more “pretty”, but hey, this was generated in less than a minute, and it’ll take me just 10 minutes to go in and tweak, versus waiting for a designer or scrolling through marketplaces looking for a paid template.</p>
<p>The prompt (I totally made up the plumbing company name, although Sonnet seems to have stolen real content from a real plumbing website, <a target="_blank" href="https://danvilleplumbers.com/">https://danvilleplumbers.com/</a> in this case, not sponsored but please support them if you are in the area 😊):</p>
<pre><code class="lang-plaintext">Design me a modern looking landing page to 
market my plumbing services. You must have an 
About me section, my services section, 
contact me section including a nice big looking footer. 
My colors are: rgb(220, 38, 38), black and white. 
Use a parallax effect and images from unsplash.com. 
Generate the content for each section as well. 
My company name is Danville Plumbers. 
Make sure you have a lot of white space and use pictures 
that are relevant to plumbing. 
Only return HTML, and no commentary or extra text 
so that it's easy for me to copy and paste.
</code></pre>
<p>And the generated HTML:</p>
<pre><code class="lang-xml"><span class="hljs-meta">&lt;!DOCTYPE <span class="hljs-meta-keyword">html</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">html</span> <span class="hljs-attr">lang</span>=<span class="hljs-string">"en"</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">head</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">meta</span> <span class="hljs-attr">charset</span>=<span class="hljs-string">"UTF-8"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">meta</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"viewport"</span> <span class="hljs-attr">content</span>=<span class="hljs-string">"width=device-width, initial-scale=1.0"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">title</span>&gt;</span>Danville Plumbers - Professional Plumbing Services<span class="hljs-tag">&lt;/<span class="hljs-name">title</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">style</span>&gt;</span><span class="css">
        * {
            <span class="hljs-attribute">margin</span>: <span class="hljs-number">0</span>;
            <span class="hljs-attribute">padding</span>: <span class="hljs-number">0</span>;
            <span class="hljs-attribute">box-sizing</span>: border-box;
            <span class="hljs-attribute">font-family</span>: <span class="hljs-string">'Arial'</span>, sans-serif;
        }

        <span class="hljs-selector-tag">html</span> {
            <span class="hljs-attribute">scroll-behavior</span>: smooth;
        }

        <span class="hljs-selector-class">.navbar</span> {
            <span class="hljs-attribute">position</span>: fixed;
            <span class="hljs-attribute">width</span>: <span class="hljs-number">100%</span>;
            <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">rgba</span>(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.9</span>);
            <span class="hljs-attribute">padding</span>: <span class="hljs-number">20px</span>;
            <span class="hljs-attribute">z-index</span>: <span class="hljs-number">1000</span>;
        }

        <span class="hljs-selector-class">.navbar</span> <span class="hljs-selector-tag">ul</span> {
            <span class="hljs-attribute">list-style</span>: none;
            <span class="hljs-attribute">display</span>: flex;
            <span class="hljs-attribute">justify-content</span>: flex-end;
            <span class="hljs-attribute">margin-right</span>: <span class="hljs-number">50px</span>;
        }

        <span class="hljs-selector-class">.navbar</span> <span class="hljs-selector-tag">li</span> {
            <span class="hljs-attribute">margin</span>: <span class="hljs-number">0</span> <span class="hljs-number">20px</span>;
        }

        <span class="hljs-selector-class">.navbar</span> <span class="hljs-selector-tag">a</span> {
            <span class="hljs-attribute">color</span>: white;
            <span class="hljs-attribute">text-decoration</span>: none;
            <span class="hljs-attribute">font-weight</span>: bold;
            <span class="hljs-attribute">transition</span>: color <span class="hljs-number">0.3s</span>;
        }

        <span class="hljs-selector-class">.navbar</span> <span class="hljs-selector-tag">a</span><span class="hljs-selector-pseudo">:hover</span> {
            <span class="hljs-attribute">color</span>: <span class="hljs-built_in">rgb</span>(<span class="hljs-number">220</span>, <span class="hljs-number">38</span>, <span class="hljs-number">38</span>);
        }

        <span class="hljs-selector-class">.hero</span> {
            <span class="hljs-attribute">height</span>: <span class="hljs-number">100vh</span>;
            <span class="hljs-attribute">background</span>: <span class="hljs-built_in">linear-gradient</span>(rgba(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.7</span>), <span class="hljs-built_in">rgba</span>(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.7</span>)),
                        <span class="hljs-built_in">url</span>(<span class="hljs-string">'https://images.unsplash.com/photo-1584622650111-993a426fbf0a?ixlib=rb-4.0.3'</span>);
            <span class="hljs-attribute">background-attachment</span>: fixed;
            <span class="hljs-attribute">background-size</span>: cover;
            <span class="hljs-attribute">display</span>: flex;
            <span class="hljs-attribute">align-items</span>: center;
            <span class="hljs-attribute">justify-content</span>: center;
            <span class="hljs-attribute">color</span>: white;
            <span class="hljs-attribute">text-align</span>: center;
        }

        <span class="hljs-selector-class">.hero-content</span> <span class="hljs-selector-tag">h1</span> {
            <span class="hljs-attribute">font-size</span>: <span class="hljs-number">4em</span>;
            <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">20px</span>;
        }

        <span class="hljs-selector-class">.hero-content</span> <span class="hljs-selector-tag">p</span> {
            <span class="hljs-attribute">font-size</span>: <span class="hljs-number">1.5em</span>;
            <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">30px</span>;
        }

        <span class="hljs-selector-class">.cta-button</span> {
            <span class="hljs-attribute">padding</span>: <span class="hljs-number">15px</span> <span class="hljs-number">30px</span>;
            <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">rgb</span>(<span class="hljs-number">220</span>, <span class="hljs-number">38</span>, <span class="hljs-number">38</span>);
            <span class="hljs-attribute">color</span>: white;
            <span class="hljs-attribute">text-decoration</span>: none;
            <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">5px</span>;
            <span class="hljs-attribute">font-weight</span>: bold;
            <span class="hljs-attribute">transition</span>: background-color <span class="hljs-number">0.3s</span>;
        }

        <span class="hljs-selector-class">.cta-button</span><span class="hljs-selector-pseudo">:hover</span> {
            <span class="hljs-attribute">background-color</span>: <span class="hljs-built_in">rgb</span>(<span class="hljs-number">185</span>, <span class="hljs-number">28</span>, <span class="hljs-number">28</span>);
        }

        <span class="hljs-selector-tag">section</span> {
            <span class="hljs-attribute">padding</span>: <span class="hljs-number">100px</span> <span class="hljs-number">20%</span>;
        }

        <span class="hljs-selector-class">.about</span> {
            <span class="hljs-attribute">background-color</span>: white;
        }

        <span class="hljs-selector-class">.services</span> {
            <span class="hljs-attribute">background-color</span>: <span class="hljs-number">#f5f5f5</span>;
        }

        <span class="hljs-selector-class">.service-grid</span> {
            <span class="hljs-attribute">display</span>: grid;
            <span class="hljs-attribute">grid-template-columns</span>: <span class="hljs-built_in">repeat</span>(auto-fit, minmax(<span class="hljs-number">300px</span>, <span class="hljs-number">1</span>fr));
            <span class="hljs-attribute">gap</span>: <span class="hljs-number">30px</span>;
            <span class="hljs-attribute">margin-top</span>: <span class="hljs-number">50px</span>;
        }

        <span class="hljs-selector-class">.service-card</span> {
            <span class="hljs-attribute">background-color</span>: white;
            <span class="hljs-attribute">padding</span>: <span class="hljs-number">30px</span>;
            <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">10px</span>;
            <span class="hljs-attribute">box-shadow</span>: <span class="hljs-number">0</span> <span class="hljs-number">5px</span> <span class="hljs-number">15px</span> <span class="hljs-built_in">rgba</span>(<span class="hljs-number">0</span>,<span class="hljs-number">0</span>,<span class="hljs-number">0</span>,<span class="hljs-number">0.1</span>);
            <span class="hljs-attribute">transition</span>: transform <span class="hljs-number">0.3s</span>;
        }

        <span class="hljs-selector-class">.service-card</span><span class="hljs-selector-pseudo">:hover</span> {
            <span class="hljs-attribute">transform</span>: <span class="hljs-built_in">translateY</span>(-<span class="hljs-number">10px</span>);
        }

        <span class="hljs-selector-class">.contact</span> {
            <span class="hljs-attribute">background</span>: <span class="hljs-built_in">linear-gradient</span>(rgba(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.8</span>), <span class="hljs-built_in">rgba</span>(<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.8</span>)),
                        <span class="hljs-built_in">url</span>(<span class="hljs-string">'https://images.unsplash.com/photo-1599567832218-907816219e67?ixlib=rb-4.0.3'</span>);
            <span class="hljs-attribute">background-attachment</span>: fixed;
            <span class="hljs-attribute">background-size</span>: cover;
            <span class="hljs-attribute">color</span>: white;
        }

        <span class="hljs-selector-class">.contact-form</span> {
            <span class="hljs-attribute">display</span>: grid;
            <span class="hljs-attribute">gap</span>: <span class="hljs-number">20px</span>;
            <span class="hljs-attribute">max-width</span>: <span class="hljs-number">600px</span>;
            <span class="hljs-attribute">margin</span>: <span class="hljs-number">0</span> auto;
        }

        <span class="hljs-selector-tag">input</span>, <span class="hljs-selector-tag">textarea</span> {
            <span class="hljs-attribute">padding</span>: <span class="hljs-number">15px</span>;
            <span class="hljs-attribute">border</span>: none;
            <span class="hljs-attribute">border-radius</span>: <span class="hljs-number">5px</span>;
        }

        <span class="hljs-selector-tag">footer</span> {
            <span class="hljs-attribute">background-color</span>: black;
            <span class="hljs-attribute">color</span>: white;
            <span class="hljs-attribute">padding</span>: <span class="hljs-number">50px</span> <span class="hljs-number">20%</span>;
            <span class="hljs-attribute">text-align</span>: center;
        }

        <span class="hljs-selector-class">.footer-content</span> {
            <span class="hljs-attribute">display</span>: grid;
            <span class="hljs-attribute">grid-template-columns</span>: <span class="hljs-built_in">repeat</span>(auto-fit, minmax(<span class="hljs-number">200px</span>, <span class="hljs-number">1</span>fr));
            <span class="hljs-attribute">gap</span>: <span class="hljs-number">30px</span>;
            <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">30px</span>;
        }

        <span class="hljs-selector-tag">h2</span> {
            <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">30px</span>;
            <span class="hljs-attribute">font-size</span>: <span class="hljs-number">2.5em</span>;
        }

        <span class="hljs-selector-tag">p</span> {
            <span class="hljs-attribute">line-height</span>: <span class="hljs-number">1.6</span>;
            <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">20px</span>;
        }
    </span><span class="hljs-tag">&lt;/<span class="hljs-name">style</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">head</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">body</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">nav</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"navbar"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">ul</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">li</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">a</span> <span class="hljs-attr">href</span>=<span class="hljs-string">"#home"</span>&gt;</span>Home<span class="hljs-tag">&lt;/<span class="hljs-name">a</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">li</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">li</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">a</span> <span class="hljs-attr">href</span>=<span class="hljs-string">"#about"</span>&gt;</span>About<span class="hljs-tag">&lt;/<span class="hljs-name">a</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">li</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">li</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">a</span> <span class="hljs-attr">href</span>=<span class="hljs-string">"#services"</span>&gt;</span>Services<span class="hljs-tag">&lt;/<span class="hljs-name">a</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">li</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">li</span>&gt;</span><span class="hljs-tag">&lt;<span class="hljs-name">a</span> <span class="hljs-attr">href</span>=<span class="hljs-string">"#contact"</span>&gt;</span>Contact<span class="hljs-tag">&lt;/<span class="hljs-name">a</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">li</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">ul</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">nav</span>&gt;</span>

    <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"hero"</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"home"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"hero-content"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">h1</span>&gt;</span>Danville Plumbers<span class="hljs-tag">&lt;/<span class="hljs-name">h1</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Your Trusted Plumbing Experts Since 1995<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">a</span> <span class="hljs-attr">href</span>=<span class="hljs-string">"#contact"</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"cta-button"</span>&gt;</span>Get Free Quote<span class="hljs-tag">&lt;/<span class="hljs-name">a</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>

    <span class="hljs-tag">&lt;<span class="hljs-name">section</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"about"</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"about"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">h2</span>&gt;</span>About Us<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>With over 25 years of experience, Danville Plumbers has been the go-to plumbing service provider in the Danville area. We pride ourselves on delivering exceptional quality work with unmatched customer service.<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Our team of licensed professionals is available 24/7 for emergency services, ensuring that your plumbing needs are met whenever they arise. We believe in transparent pricing, honest service, and building long-lasting relationships with our clients.<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">section</span>&gt;</span>

    <span class="hljs-tag">&lt;<span class="hljs-name">section</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"services"</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"services"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">h2</span>&gt;</span>Our Services<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"service-grid"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"service-card"</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">h3</span>&gt;</span>Emergency Repairs<span class="hljs-tag">&lt;/<span class="hljs-name">h3</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>24/7 emergency plumbing services for any urgent situations.<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"service-card"</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">h3</span>&gt;</span>Installation<span class="hljs-tag">&lt;/<span class="hljs-name">h3</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Professional installation of fixtures, pipes, and plumbing systems.<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"service-card"</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">h3</span>&gt;</span>Maintenance<span class="hljs-tag">&lt;/<span class="hljs-name">h3</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Regular maintenance and inspection services to prevent future issues.<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">section</span>&gt;</span>

    <span class="hljs-tag">&lt;<span class="hljs-name">section</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"contact"</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"contact"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">h2</span>&gt;</span>Contact Us<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">form</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"contact-form"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"text"</span> <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Your Name"</span> <span class="hljs-attr">required</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"email"</span> <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Your Email"</span> <span class="hljs-attr">required</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"tel"</span> <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Your Phone"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">textarea</span> <span class="hljs-attr">rows</span>=<span class="hljs-string">"5"</span> <span class="hljs-attr">placeholder</span>=<span class="hljs-string">"Your Message"</span> <span class="hljs-attr">required</span>&gt;</span><span class="hljs-tag">&lt;/<span class="hljs-name">textarea</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">button</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"submit"</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"cta-button"</span>&gt;</span>Send Message<span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">form</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">section</span>&gt;</span>

    <span class="hljs-tag">&lt;<span class="hljs-name">footer</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"footer-content"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">h3</span>&gt;</span>Danville Plumbers<span class="hljs-tag">&lt;/<span class="hljs-name">h3</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>123 Main Street<span class="hljs-tag">&lt;<span class="hljs-name">br</span>&gt;</span>Somwhere, Someville<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">h3</span>&gt;</span>Hours<span class="hljs-tag">&lt;/<span class="hljs-name">h3</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Monday-Friday: 8am-6pm<span class="hljs-tag">&lt;<span class="hljs-name">br</span>&gt;</span>
                   Emergency Service: 24/7<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">h3</span>&gt;</span>Contact<span class="hljs-tag">&lt;/<span class="hljs-name">h3</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Phone: (965) 777-0123<span class="hljs-tag">&lt;<span class="hljs-name">br</span>&gt;</span>
                   Email: info@xyz.com<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span><span class="hljs-symbol">&amp;copy;</span> 2023 Danville Plumbers. All rights reserved.<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">footer</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">body</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">html</span>&gt;</span>
</code></pre>
<h2 id="heading-not-just-static-websites">Not just static websites</h2>
<p>With clever prompting, you can even get Sonnet to code in any backend language, including writing the email code for sending forms or building a gallery.</p>
<p>I am not a Figma person (nor a frontend person, for that matter). I don’t usually draw layouts and concepts first unless it’s a large project; instead, I’ll take an hour or two to “throw together” a POC and then fiddle with it until I get something closer to the finished concept.</p>
<p>Now with Sonnet, within a few minutes, I can scaffold designs, get variations, and play around with different concepts in a fraction of the time. It’s almost like the Ruby-on-Rails scaffold era, where you could spin up a whole blog in minutes.</p>
<p>Don’t worry though; AI is not going to replace designers or developers. It cannot really think or create at the same level as we do, but it can give creative (or technical) people a powerful paintbrush for scaffolding ideas really quickly.</p>
<p>ChatGPT was okayish at one point, but the quality of the code seemed to drop over time, and half the time that code wouldn’t work. Sonnet, on the other hand, is far better at generating good-quality code that actually works!</p>
<h2 id="heading-claude-sonnets-not-perfect-anthropic-has-scaling-issues">Claude Sonnet’s not perfect: Anthropic has scaling issues</h2>
<p>I have often run into weird scaling issues with Anthropic (“model overloaded” errors) and even times when the API returned blank results. I find OpenAI much more stable, and GPT-4o mini is much cheaper as well.</p>
<p>Thus, for RAG-type applications or high-traffic use cases, I tend to use OpenAI models. For creative work and as a coding assistant, Sonnet’s stability isn’t much of an issue.</p>
<p>Another annoying thing with Anthropic is that you have to follow a strictly alternating conversation sequence (human, then AI, then human, and so on). This is not natural in the real world, especially when dealing with voice AI.</p>
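<p>To illustrate (field names based on Anthropic’s public Messages API; the model name is just a placeholder), the messages array must alternate between user and assistant turns, starting with a user turn:</p>
<pre><code class="lang-json">{
  "model": "claude-3-5-sonnet-latest",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "Design me a landing page."},
    {"role": "assistant", "content": "Here is a first draft ..."},
    {"role": "user", "content": "Make the footer bigger."}
  ]
}
</code></pre>
<p>In practice, that means merging consecutive turns from the same speaker before sending, which is exactly the kind of bookkeeping you don’t want in a real-time voice pipeline.</p>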
<p>OpenAI just seems more idiot-proof 😂.</p>
<p>Hopefully, Anthropic will work on the developer experience; I feel that’s really holding them back, because their models are great for creative tasks, usually better than OpenAI’s.</p>
]]></content:encoded></item><item><title><![CDATA[How to set up a self-hosted Linux server from scratch?]]></title><description><![CDATA[I often come across people on Reddit and other places complaining about Vercel overcharging them for spikes in traffic and worried about the cost of scaling.
Vercel is a great platform, it works and is convenient. Most developers hate dealing with se...]]></description><link>https://kevincoder.co.za/how-to-set-up-a-self-hosted-linux-server-from-scratch</link><guid isPermaLink="true">https://kevincoder.co.za/how-to-set-up-a-self-hosted-linux-server-from-scratch</guid><category><![CDATA[vercel alternative]]></category><category><![CDATA[Linux]]></category><category><![CDATA[vps server]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Tue, 03 Dec 2024 15:39:49 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1733240283511/48945ace-0a2b-4917-916b-f22fe8417d34.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I often come across people on Reddit and other places complaining about Vercel overcharging them for spikes in traffic and worried about the cost of scaling.</p>
<p>Vercel is a great platform; it works and is convenient. Most developers hate dealing with servers and are sometimes scared of touching Linux, so it makes sense to outsource this aspect to DevOps specialists and just focus on coding.</p>
<p>This approach works well initially, but as you grow, costs can escalate, sometimes exponentially! So why not roll out your own servers? It might take a few hours of work to harden SSH and set everything up, but once running, these servers can go for years with very little additional maintenance.</p>
<p>If you want to go down the self-hosted route but don’t know where to start, then this article is perfect for you. I will go through it step-by-step.</p>
<blockquote>
<p>Note: This tutorial assumes you’re using Ubuntu Server for your instances. Debian might work as well.</p>
</blockquote>
<h2 id="heading-understanding-the-costs">Understanding the costs</h2>
<p>Before we even think about setting up a server, let’s talk about the costs. When you host with PaaS providers like Vercel, they usually charge you for storage and compute separately.</p>
<p>You therefore pay for how much memory your application consumes and how long it runs, plus any storage you use. For a low-traffic site, this can be really affordable because, with few or no requests, your costs are negligible.</p>
<p>A VPS or dedicated server, on the other hand, tends to come with a fixed cost. I use Hetzner, so I will base my figures on their current pricing, but the numbers should be similar at other hosting companies such as DigitalOcean or AWS.</p>
<p>On Hetzner, you can get decently sized VPS servers at these rates (not sponsored; I use them personally for my sites):</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726225698139/34b73210-f74f-4b69-9217-fe9ab04a248b.png" alt class="image--center mx-auto" /></p>
<p>With a VPS instance, you get everything you need including network traffic, disk space, and usually better RAM and CPU allocations. In addition, you have full control of your instance and can run anything you desire.</p>
<p>The only caveat is that VPS providers do limit your traffic allocation per month, so if you exceed 20 TB in one month (in the Hetzner case above), you’ll pay per gigabyte for the extra usage.</p>
<p>Realistically though, if you are pushing more than 20 TB of traffic, these VPS servers are still going to be way cheaper than something like Vercel.</p>
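<p>To put rough numbers on it (these prices are made up purely for illustration; check your provider’s current rates): suppose metered bandwidth costs $0.15 per GB, while a flat-rate VPS includes 20 TB for about $15 a month. For 2 TB of monthly egress:</p>
<pre><code class="lang-bash"># Hypothetical metered bill: 2000 GB at $0.15 per GB
echo "2000 * 0.15" | bc
# 300.00 (dollars per month)

# Hypothetical flat-rate VPS: 2 TB is well within the included 20 TB,
# so the bill stays at roughly 15 dollars per month
</code></pre>
<p>Even with generous assumptions on the metered side, the flat-rate instance wins by an order of magnitude once traffic becomes non-trivial.</p>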
<h2 id="heading-lets-learn-some-basic-nano">Let’s learn some basic Nano</h2>
<p>I’m sorry if this seems oddly misplaced in the context of this article, but I wanted to get it out of the way as quickly as possible. Linux uses config files for nearly every application, so a basic understanding of terminal-based text editors is essential. If you are already familiar with Nano, Vi, Vim, etc., you can safely skip this section.</p>
<p>I can’t remember if it comes pre-installed with Ubuntu, but if not, simply run:</p>
<pre><code class="lang-bash">apt update
apt install nano
</code></pre>
<p>Nano commands:</p>
<ul>
<li><p>“nano file_name”: opens the specified file path in the editor.</p>
</li>
<li><p>Arrow keys: move up, down, left, or right in the editor.</p>
</li>
<li><p>“ctrl+o”: save any changes.</p>
</li>
<li><p>“ctrl+x”: exit.</p>
</li>
<li><p>“ctrl+/”: go to a specific line number, e.g. 10.</p>
</li>
<li><p>“ctrl+c”: cancel an operation.</p>
</li>
<li><p>“ctrl+w”: search through the file.</p>
</li>
<li><p>“nano +5 file_name”: opens the file with the cursor on the 5th line.</p>
</li>
</ul>
<p>There you go, simple, right? At a bare minimum, you only need to know “ctrl+o” and “ctrl+x” to edit, save, and exit.</p>
<h2 id="heading-initial-setup">Initial setup</h2>
<p>The first thing you want to do is generate SSH keys; this is a much more secure way of accessing your server compared to regular passwords. On most terminals, you can run:</p>
<blockquote>
<p>ℹ️ SSH keys are a pair of long cryptographic "secret codes": a private key and a public key. These are much more secure and harder to crack than regular old passwords. The server keeps a copy of your public key and uses it to verify proof (a signature) that you hold the matching private key when you try to SSH in; the private key itself never leaves your machine, and if the check fails, the connection is automatically denied.</p>
</blockquote>
<pre><code class="lang-bash">ssh-keygen -t rsa -b 4096
</code></pre>
<p>The above command will generate a private and public key pair. You will need to copy the “.pub” file’s contents and paste it into the server’s “~/.ssh/authorized_keys” file; most providers offer a simple GUI to do this.</p>
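<p>If your provider doesn’t offer a GUI, OpenSSH ships with “ssh-copy-id” for exactly this. Under the hood it just appends your public key to the server’s authorized_keys file, which you can sketch locally like so (the paths below are a throwaway temp directory for illustration, not your real ~/.ssh):</p>
<pre><code class="lang-bash"># Create a throwaway key pair in a temp dir (simulating ~/.ssh)
tmp=$(mktemp -d)
ssh-keygen -t rsa -b 2048 -N "" -f "$tmp/id_rsa" >/dev/null

# What ssh-copy-id effectively does on the server side:
mkdir -p "$tmp/server_ssh"
cat "$tmp/id_rsa.pub" >> "$tmp/server_ssh/authorized_keys"
chmod 600 "$tmp/server_ssh/authorized_keys"

# The real one-liner against your server would be:
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.x.y.z
</code></pre>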
<blockquote>
<p>ℹ️ ~/ is just a shortcut for /home/username or the home folder for the currently logged in user on the Linux terminal.</p>
</blockquote>
<p>On Hetzner, you can upload this key under this section when creating a server (I blurred out my key names for privacy):</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726226695756/ce511d74-4218-4f5f-aed1-a54b1e8cb183.png" alt class="image--center mx-auto" /></p>
<p>Finally, once the server has been created, you will get a public IP. To access the machine, just SSH in:</p>
<pre><code class="lang-bash">ssh root@192.x.y.z
</code></pre>
<p>This should automatically pick up the key you created and log you in. If you do encounter issues, try the following:</p>
<pre><code class="lang-bash">ssh -o IdentitiesOnly=yes -i /path/to/your/private/key root@192.x.y.z
</code></pre>
<p>You may also see the error below, typically when a server IP you’ve connected to before gets reassigned to a fresh machine. The warning is a bit “dramatic” to say the least. You should verify that you are using the correct IP and key, but usually this is just a sanity check by the SSH client: “The host key for this IP has changed, and I don’t know if I can trust it, so I will refuse and print a scary message!”</p>
<pre><code class="lang-bash">@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint <span class="hljs-keyword">for</span> the ED25519 key sent by the remote host is
SHA256:g8DZMXxNy8a7fM4/Xyz.
</code></pre>
<p>To fix this, simply remove the stale entry from your known_hosts file so the new host key can be trusted on the next connection:</p>
<pre><code class="lang-bash">ssh-keygen -f <span class="hljs-string">"/home/<span class="hljs-variable">$USER</span>/.ssh/known_hosts"</span> -R <span class="hljs-string">"x.y.z"</span>
</code></pre>
<h2 id="heading-securing-the-server">Securing the server</h2>
<p>When it comes to VPS servers you should use the following security best practices:</p>
<ul>
<li><p>Use both a network and server firewall. Most providers including Hetzner have a concept of a network firewall (also known as a cloud firewall or WAF). You should set one up as soon as you provision your instance. Open ports (TCP): 22, 443, 80 and block everything else.</p>
</li>
<li><p>Change the default SSH port. Attackers can still scan for the new port, but most attacks come from automated scripts that target port 22; by simply changing this port, you are protecting yourself from loads of automated tools.</p>
</li>
<li><p>Disable root access and create an SSH-only user.</p>
</li>
<li><p>Use a VPN. This is optional but highly recommended: you can then open ports only to your VPN, keeping your server ports invisible to the rest of the internet.</p>
</li>
</ul>
<p>Next, let’s go through each of the mentioned security steps.</p>
<h2 id="heading-setting-up-a-network-firewall">Setting up a network firewall</h2>
<p>This will differ from provider to provider; I will just give you an example using Hetzner. In the Hetzner cloud console, click on “Firewalls” in the left navigation (Cloud firewalls are free to use).</p>
<p>You should see the following screen. Simply allow TCP 22, 80, and 443. In my case, I also added “9222” because, in the next few steps, we’ll change the default SSH port to “9222”. You can choose any random port you want; just make sure it doesn’t clash with any process running on your server.</p>
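<p>To check whether a port is already taken on your server, you can list the listening sockets first (the “ss” utility ships with Ubuntu; “9222” below is just the example port from this guide):</p>
<pre><code class="lang-bash"># List all listening TCP ports; pick a port that does not appear here
ss -tln

# Or check one specific port, e.g. 9222 (no rows in the output means it's free)
ss -tln 'sport = :9222'
</code></pre>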
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726233592534/93e7381d-1365-403c-9e9e-8de12c0ba49b.png" alt class="image--center mx-auto" /></p>
<p>Whenever you create an instance in the future, you should see the “firewalls” option, just click on the one you previously created to attach the instance to that firewall:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726233723928/1a387d4a-72fc-48df-b203-80558e788eec.png" alt class="image--center mx-auto" /></p>
<p>Alternatively, for existing servers, just click on the “Firewalls” tab on the instance’s details page and select “Apply Firewall” to add the server to one of your existing firewalls:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726233822664/5e3f007b-1b71-4e24-9d85-953917689af4.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-setting-up-a-server-firewall">Setting up a server firewall</h2>
<p>It’s also good practice to set up a firewall on the VPS instance itself. It’s highly unlikely that network traffic will bypass the main firewall, but just in case: you can never be too sure!</p>
<p>First, you will need to SSH into the server:</p>
<pre><code class="lang-bash">ssh root@your_ip
</code></pre>
<p>UFW is the default firewall on Ubuntu. Once you’ve logged in to the remote server, you can confirm that UFW is installed by typing:</p>
<pre><code class="lang-bash">ufw
</code></pre>
<p>You should get the “help” menu, if not just run:</p>
<pre><code class="lang-bash">apt update -y
apt install ufw -y
</code></pre>
<p>Let’s allow the required ports as follows:</p>
<pre><code class="lang-bash">ufw allow http
ufw allow https
ufw allow ssh
ufw allow from any to any port 9222
ufw <span class="hljs-built_in">enable</span>
</code></pre>
<p>You’ll be asked to confirm; just hit “y”. Now if you run:</p>
<pre><code class="lang-bash">ufw status numbered
<span class="hljs-comment">###### You should see #######</span>

Status: active

To                         Action      From
--                         ------      ----
[ 1] 80/tcp                     ALLOW IN    Anywhere                  
[ 2] 443/tcp                    ALLOW IN    Anywhere                  
[ 3] 22/tcp                     ALLOW IN    Anywhere                  
[ 4] 9222                       ALLOW IN    Anywhere                  
[ 5] 80/tcp (v6)                ALLOW IN    Anywhere (v6)             
[ 6] 443/tcp (v6)               ALLOW IN    Anywhere (v6)             
[ 7] 22/tcp (v6)                ALLOW IN    Anywhere (v6)             
[ 8] 9222 (v6)                  ALLOW IN    Anywhere (v6)
</code></pre>
<p>Awesome! We now have a decently secured server. If you opted for the VPN option, you can do the following instead:</p>
<pre><code class="lang-bash">ufw allow http
ufw allow https
ufw allow from x.x.x.x to any port 22
ufw allow from x.x.x.x to any port 9222
ufw <span class="hljs-built_in">enable</span>
</code></pre>
<p>Naturally “x.x.x.x” should be replaced by your VPN’s IP address.</p>
<blockquote>
<p>Note: I use “ufw status numbered” because this also prints the “index” of the rule, in case we need to delete that rule. You can totally omit “numbered” and this command will work fine.</p>
</blockquote>
<h2 id="heading-changing-the-default-ssh-port">Changing the default SSH port</h2>
<p>The network firewall can be altered at any time from the provider’s console; the instance firewall cannot be fixed once you’ve locked yourself out. It’s therefore advisable to keep one tab open with an active SSH connection, allow both 22 and 9222 [or whatever port you choose] first, test that the new port works, and only then drop port 22 from your firewall.</p>
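<p>As a concrete sketch, the safe order of operations looks like this (9222 and the IP are examples; keep your original SSH session open in another tab the whole time):</p>
<pre><code class="lang-bash"># 1. Open the new port alongside 22 before touching sshd
ufw allow 9222

# 2. Change the Port in /etc/ssh/sshd_config and restart ssh,
#    then verify from a NEW terminal that the new port works:
ssh -p 9222 root@192.x.y.z

# 3. Only once step 2 succeeds, close the old port
ufw delete allow 22
</code></pre>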
<p>To change the port do the following (or you can just nano /etc/ssh/sshd_config and manually change it):</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Backup original file just incase</span>
cp /etc/ssh/sshd_config /etc/ssh/sshd_config.backup

<span class="hljs-comment"># Set the new port</span>
sed -i <span class="hljs-string">'s/\#Port 22/Port 9222/'</span> /etc/ssh/sshd_config

<span class="hljs-comment"># Restart ssh</span>
systemctl restart ssh
</code></pre>
<h2 id="heading-add-another-layer-of-security">Add another layer of security</h2>
<p>Brute-force attacks are another common security issue when you host a publicly accessible server. By brute force, I mean a bot that will constantly try hundreds or thousands of different passwords or key combinations to break into your server.</p>
<p>A good layer of protection against this kind of attack is Fail2Ban. Fail2Ban will basically watch common ports like 22 for abusive traffic and block the offending IPs automatically.</p>
<p>To set up Fail2Ban:</p>
<pre><code class="lang-bash">sudo apt install fail2ban
<span class="hljs-comment"># So we can customize fail2ban settings</span>
sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local

sudo nano /etc/fail2ban/jail.local
</code></pre>
<p>Next, paste the following into the config file:</p>
<pre><code class="lang-bash">[DEFAULT]
bantime = 1h
findtime = 10m
maxretry = 5
banaction = iptables-multiport

ignoreip = 127.0.0.1/8 ::1
destemail = your@email.com

[sshd]
enabled = <span class="hljs-literal">true</span>
port = 9222
filter = sshd
logpath = /var/<span class="hljs-built_in">log</span>/auth.log
maxretry = 3
bantime = 1h
</code></pre>
<p>And finally, restart:</p>
<pre><code class="lang-bash">sudo systemctl restart fail2ban
</code></pre>
<p>You can do much more advanced stuff like preventing DDoS attacks on ports 443, 80, etc. For now, though, we are basically just protecting port 9222. If abuse is detected, Fail2Ban will block that IP for 1 hour.</p>
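<p>To confirm the jail is actually running (and to unban yourself if you trip it while testing), Fail2Ban ships with a client tool (203.0.113.7 below is an example IP):</p>
<pre><code class="lang-bash"># Show the sshd jail's state, failure counts, and currently banned IPs
sudo fail2ban-client status sshd

# Unban an IP you locked out by accident
sudo fail2ban-client set sshd unbanip 203.0.113.7
</code></pre>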
<h2 id="heading-setting-up-an-ssh-only-user">Setting up an SSH-only user</h2>
<p>SSH’ing in as root is a bad idea: if this user is compromised, the attacker has access to your whole system. Instead, you should create a password-based user that can elevate their privileges to sudo only when needed.</p>
<pre><code class="lang-bash">adduser secureuser
usermod -aG sudo secureuser
</code></pre>
<p>Next, you should copy your ssh pub key to ~/.ssh/authorized_keys:</p>
<pre><code class="lang-bash">sudo su - secureuser

mkdir ~/.ssh
<span class="hljs-comment"># Paste your pub key with</span>
nano ~/.ssh/authorized_keys

<span class="hljs-comment"># Fix permissions</span>
chmod 600 ~/.ssh/authorized_keys
chmod 700 ~/.ssh
</code></pre>
<p>Now, in a separate terminal, try to SSH in as this user:</p>
<pre><code class="lang-bash">ssh -p 9222 secureuser@server_ip
</code></pre>
<p>🤞The above should let you in. If not, check that there are no stray spaces or line breaks in the authorized_keys file and re-apply the permissions; usually it’s either bad permissions or an invalid public key.</p>
<p>You can also use “-i” with your SSH command to specify which key you want to use:</p>
<pre><code class="lang-bash">ssh -i ~/.ssh/id_rsa -p 9222 secureuser@server_ip
</code></pre>
<p>Almost there, we want to do two more security optimizations:</p>
<ul>
<li><p>Prevent the root user from logging in.</p>
</li>
<li><p>Disable password logins. Earlier we created the SSH user with a password; this password is not meant for SSH access. Instead, it’s there to add an extra layer of security: should an attacker gain access to this SSH user, reaching root is still a little more difficult, since running “sudo su -” will prompt for the password.</p>
</li>
</ul>
<p>To action:</p>
<pre><code class="lang-bash">nano /etc/ssh/sshd_config
<span class="hljs-comment"># Uncomment and change these lines to:</span>
PasswordAuthentication no
PermitRootLogin no
</code></pre>
<p>Finally, restart the ssh daemon:</p>
<pre><code class="lang-bash">systemctl restart ssh
</code></pre>
<h2 id="heading-setting-up-docker">Setting up docker</h2>
<p>Great! Now you have a fairly secure server up and running, but to run Next.js or your application code, you’ll probably need to install Docker first. Installing Docker is a breeze:</p>
<pre><code class="lang-bash">curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
</code></pre>
<p>Now, when you type the following, you should see that Docker is installed and running:</p>
<pre><code class="lang-bash">docker ps
</code></pre>
<p>It is always a good idea to run Docker containers as a non-root user; you can set one up as follows:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># You can call the user whatever you want</span>
sudo useradd -m -s /bin/bash dockeruser
sudo usermod -aG docker dockeruser
</code></pre>
<p>Now, you should log in with this user and run your docker containers as “dockeruser”:</p>
<pre><code class="lang-bash">sudo su - dockeruser
</code></pre>
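<p>Once logged in as “dockeruser”, a quick sanity check confirms the group membership took effect and that containers run without sudo (this assumes the Docker daemon is up):</p>
<pre><code class="lang-bash"># Should list "docker" among the user's groups
id -nG

# Pulls a tiny test image and prints a hello message if all is well
docker run --rm hello-world
</code></pre>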
<h2 id="heading-conclusion">Conclusion</h2>
<p>There you go: a detailed, step-by-step, basically copy-and-paste guide to everything you need to run your own VPS server. It may seem like a lot of steps, but it’s 99% the same for every server you set up, so it becomes muscle memory after a while.</p>
]]></content:encoded></item><item><title><![CDATA[Python voice AI web phone using WebRTC]]></title><description><![CDATA[Recently, I did a walk-through on how you can use SIP and voice AI to answer the phone with Python (read more here). This got me experimenting a bit more. I have a 6-year-old, and as you can imagine, kids at that age love amusing, funny voices.
What ...]]></description><link>https://kevincoder.co.za/how-i-used-voice-ai-to-bring-imaginary-characters-to-life</link><guid isPermaLink="true">https://kevincoder.co.za/how-i-used-voice-ai-to-bring-imaginary-characters-to-life</guid><category><![CDATA[Python]]></category><category><![CDATA[Machine Learning]]></category><category><![CDATA[#VoiceAI]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Tue, 29 Oct 2024 15:57:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1730217555599/70f17cd2-33c4-4c1f-9b03-9828ec5b90de.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Recently, I did a walk-through on how you can use SIP and voice AI to answer the phone with Python (read more <a target="_blank" href="https://kevincoder.co.za/answer-the-phone-with-python">here</a>). This got me experimenting a bit more. I have a 6-year-old, and as you can imagine, kids at that age love amusing, funny voices.</p>
<p>What if I could build an AI voice phone just for fun? Something that she could interact with. This would be way cooler than those stuffed animal toys that keep repeating the same sentences over and over. Since this “toy” is powered by a full LLM, it could actually have an entire conversation and role-play as if it were an actual cartoon character!</p>
<blockquote>
<p>💡 Are you new to Python? Learn Python with <a target="_blank" href="https://learnpython.com/?tap_a=131378-305e0d&amp;ref=ngm5zjv">LearnPython.com</a> courses. Fun interactive courses that are based on real-life business scenarios, meaning you’ll be writing Python code and seeing the results instantly. <em>No need to install Python or other tools on your device, everything happens through your favorite web browser</em> (Sponsored content).</p>
</blockquote>
<p>In this article, I will walk you through step-by-step how I achieved this and all the relevant technologies and services needed.</p>
<h2 id="heading-webrtc-voiceai-how-will-this-work">WebRTC VoiceAI: How will this work?</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728657445771/23c85178-b1da-4f2b-907e-5693dd4d6a6c.jpeg" alt class="image--center mx-auto" /></p>
<p>WebRTC is a browser peer-to-peer standard that allows just about any modern browser to stream rich media content such as video or audio in real-time, similar to a voice or video call.</p>
<p>Thus, WebRTC is perfect for our little Voice AI character (Aka ZaZu the Monkey!). For the backend, we will use a WebSocket server built in Python, which orchestrates the AI chain to transcribe the audio, prompt the model, and finally convert the model’s response back to audio and stream it back to the browser using WebRTC.</p>
<p>Now, since this is a fun app, I wasn’t going to spend a lot of time building my own transcription model, or TTS model for that matter. Instead, we’ll just use Assembly AI for transcription and OpenAI for everything else.</p>
<p>I initially tried to build this with VOSK, a decent open-source speech-to-text library, instead of Assembly AI; however, the transcription quality wasn’t great, and it wasn’t worth the time and effort to train a custom VOSK model. AssemblyAI has really good accuracy and costs just $0.37 per hour, which is insanely cheap!</p>
<p>You can view the full source code here: <a target="_blank" href="https://github.com/kevincoder-co-za/zazu-voiceai">https://github.com/kevincoder-co-za/zazu-voiceai</a></p>
<blockquote>
<p>⚠️ I omitted “try except” blocks throughout to make the code more readable, however, for a real production app, you should catch and handle errors more efficiently.</p>
</blockquote>
<h2 id="heading-setting-up-our-websocket-server">Setting up our WebSocket server</h2>
<p>For the WebSocket server, we’ll go a bit old-school and use Flask. Flask has very good support for WebSockets and makes it fairly easy to spin up a quick prototype. To get started, let’s install a few pip packages:</p>
<pre><code class="lang-bash">pip install -r requirements.txt
</code></pre>
<pre><code class="lang-plaintext">flask
flask-sockets
gevent
gevent-websocket
assemblyai
ffmpeg-python
</code></pre>
<p>In addition, you will need to install the “<a target="_blank" href="https://www.ffmpeg.org/">FFmpeg</a>” binary for your operating system. FFmpeg is a powerful media-processing toolkit used by most modern audio tools on the market, and we’ll use it to alter the audio generated by OpenAI to sound more cartoonish.</p>
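<p>On Ubuntu or Debian, for example, FFmpeg is available from the system package manager (on macOS, “brew install ffmpeg” does the same job):</p>
<pre><code class="lang-bash">sudo apt update
sudo apt install -y ffmpeg

# Verify the binary is on your PATH
ffmpeg -version
</code></pre>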
<p>Next, you’ll want to create a Python file: “server.py” or whatever you prefer:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> flask <span class="hljs-keyword">import</span> Flask, render_template
<span class="hljs-keyword">from</span> flask_sockets <span class="hljs-keyword">import</span> Sockets
<span class="hljs-keyword">from</span> gevent <span class="hljs-keyword">import</span> pywsgi
<span class="hljs-keyword">from</span> geventwebsocket.handler <span class="hljs-keyword">import</span> WebSocketHandler
<span class="hljs-keyword">from</span> audio_handlers <span class="hljs-keyword">import</span> AudioHandler
<span class="hljs-keyword">from</span> llm_adapters <span class="hljs-keyword">import</span> OpenAILLM
<span class="hljs-keyword">import</span> settings

app = Flask(__name__)
sockets = Sockets(app)

<span class="hljs-meta">@app.route("/dialer")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">dialer</span>():</span>
    <span class="hljs-keyword">return</span> render_template(<span class="hljs-string">"dialer.html"</span>, 
        sample_rate = settings.SAMPLE_RATE,
        socket_url = settings.SOCKET_URL
    )

<span class="hljs-meta">@sockets.route("/websocket/stream")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">audio_stream</span>(<span class="hljs-params">ws</span>):</span>
    audio_handler = <span class="hljs-literal">None</span>
    llm = <span class="hljs-literal">None</span>

    <span class="hljs-keyword">while</span> <span class="hljs-keyword">not</span> ws.closed:
        message = ws.receive()

        <span class="hljs-keyword">if</span> message <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
            <span class="hljs-keyword">continue</span>
        <span class="hljs-keyword">if</span> isinstance(message, str):
            <span class="hljs-keyword">if</span> <span class="hljs-string">"start call"</span> <span class="hljs-keyword">in</span> message:
                print(<span class="hljs-string">"Call started"</span>, flush=<span class="hljs-literal">True</span>)
                llm = OpenAILLM()
                audio_handler = AudioHandler(llm, ws)
            <span class="hljs-keyword">elif</span> <span class="hljs-string">"end call"</span> <span class="hljs-keyword">in</span> message <span class="hljs-keyword">and</span> audio_handler:
                audio_handler.stop()
                llm = <span class="hljs-literal">None</span>

        <span class="hljs-keyword">elif</span> isinstance(message, bytes) <span class="hljs-keyword">or</span> isinstance(message, bytearray):
            audio_handler.stream(bytes(message))

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    WEBSOCKET_PORT = <span class="hljs-number">5000</span>
    server = pywsgi.WSGIServer((<span class="hljs-string">""</span>, WEBSOCKET_PORT), app, handler_class=WebSocketHandler)
    print(<span class="hljs-string">f"Server listening on ws://0.0.0.0:<span class="hljs-subst">{WEBSOCKET_PORT}</span>"</span>)
    server.serve_forever()
</code></pre>
<p>Let’s go through this step-by-step, first, we set up our socket server and two routes:</p>
<pre><code class="lang-python">app = Flask(__name__)
sockets = Sockets(app)

<span class="hljs-meta">@app.route("/dialer")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">dialer</span>():</span>
    <span class="hljs-keyword">return</span> render_template(<span class="hljs-string">"dialer.html"</span>, 
        sample_rate = settings.SAMPLE_RATE,
        socket_url = settings.SOCKET_URL
    )

<span class="hljs-meta">@sockets.route("/websocket/stream")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">audio_stream</span>(<span class="hljs-params">ws</span>):</span>
</code></pre>
<p>The dialer route just returns an HTML page: “dialer.html”. This contains our JavaScript WebRTC phone implementation. We’ll get to that in a bit.</p>
<p>The “/websocket/stream” is where all the magic happens. When the browser initiates a call, this endpoint is triggered, and a socket is opened. Since the socket will remain open until either party terminates the call, we can therefore constantly stream data back and forth in a consistent manner.</p>
<pre><code class="lang-python"><span class="hljs-meta">@sockets.route("/websocket/stream")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">audio_stream</span>(<span class="hljs-params">ws</span>):</span>
    audio_handler = <span class="hljs-literal">None</span>
    llm = <span class="hljs-literal">None</span>

    <span class="hljs-keyword">while</span> <span class="hljs-keyword">not</span> ws.closed:
        message = ws.receive()

        <span class="hljs-keyword">if</span> message <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
            <span class="hljs-keyword">continue</span>
        <span class="hljs-keyword">if</span> isinstance(message, str):
            <span class="hljs-keyword">if</span> <span class="hljs-string">"start call"</span> <span class="hljs-keyword">in</span> message:
                print(<span class="hljs-string">"Call started"</span>, flush=<span class="hljs-literal">True</span>)
                llm = OpenAILLM()
                audio_handler = AudioHandler(llm, ws)
            <span class="hljs-keyword">elif</span> <span class="hljs-string">"end call"</span> <span class="hljs-keyword">in</span> message <span class="hljs-keyword">and</span> audio_handler:
                audio_handler.stop()
                llm = <span class="hljs-literal">None</span>

        <span class="hljs-keyword">elif</span> isinstance(message, bytes) <span class="hljs-keyword">or</span> isinstance(message, bytearray):
            audio_handler.stream(bytes(message))
</code></pre>
<p>In our “audio_stream” method, we scan the incoming data to check if it’s raw bytes or a string. When the data is raw bytes, this means that we are receiving audio data and when the data is a string, this means we are receiving an event such as “start call” or “end call”.</p>
<p>When the call starts, we set up an <strong>audio_handler</strong> and <strong>llm</strong>. The “audio_handler” will transcribe audio and send it back to the user.</p>
<p>The “llm” will communicate with GPT4o-mini and generate an appropriate response to the user based on the transcription.</p>
<h2 id="heading-building-the-llm-adapter">Building the LLM Adapter</h2>
<p>The LLM adapter has two tasks it needs to perform:</p>
<ul>
<li><p>Prompt the text model with the transcription and get back a generated response (invoke method).</p>
</li>
<li><p>Take the generated response and convert it to audio so that we can stream it back to the browser (text_to_audio method).</p>
</li>
</ul>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">import</span> settings

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">OpenAILLM</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        self.llm = OpenAI()
        self.messages = [
            (<span class="hljs-string">"system"</span>,settings.SYSTEM_AI_PROMPT)
        ]

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">invoke</span>(<span class="hljs-params">self, message</span>):</span>
        self.messages.append((<span class="hljs-string">"user"</span>, message))
        messages = self.messages.copy()

        chat_completion = self.llm.chat.completions.create(
            model= settings.OPENAI_GPT_MODEL,
            messages=[{<span class="hljs-string">"role"</span>: m[<span class="hljs-number">0</span>], <span class="hljs-string">"content"</span>: m[<span class="hljs-number">1</span>]} <span class="hljs-keyword">for</span> m <span class="hljs-keyword">in</span> messages]
        )

        response = chat_completion.choices[<span class="hljs-number">0</span>].message.content
        self.messages.append((<span class="hljs-string">"assistant"</span>, response))

        <span class="hljs-keyword">return</span> response

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">text_to_audio</span>(<span class="hljs-params">self, text, audio_file_path</span>):</span>
        response = self.llm.audio.speech.create(
            model=settings.OPENAI_TTS_MODEL,
            voice=settings.OPENAI_TTS_VOICE,
            input=text
        )

        response.stream_to_file(audio_file_path)
</code></pre>
<p><strong>Note:</strong> We also have some memory built in so that the LLM has context for the entire conversation. To maintain memory, we simply append every message to a “messages” list and send the whole batch with each chat-completion request.</p>
<h2 id="heading-building-the-audio-manager">Building the Audio Manager</h2>
<p>The Audio Manager can be found in “audio_handlers.py”:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> uuid
<span class="hljs-keyword">import</span> settings
<span class="hljs-keyword">import</span> assemblyai <span class="hljs-keyword">as</span> aai
<span class="hljs-keyword">import</span> ffmpeg

aai.settings.api_key = settings.ASSEMBLYAI_API_KEY

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AudioHandler</span>(<span class="hljs-params">aai.RealtimeTranscriber</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, llm, ws</span>):</span>
        super().__init__(
            on_data=self.on_data,
            sample_rate=settings.SAMPLE_RATE,
            on_error=<span class="hljs-keyword">lambda</span> x : print(x, flush=<span class="hljs-literal">True</span>),
            disable_partial_transcripts=<span class="hljs-literal">True</span>
        )
        self.llm = llm
        self.ws = ws
        self.connect()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">stop</span>(<span class="hljs-params">self</span>):</span>
        self.close()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">on_data</span>(<span class="hljs-params">self, transcript: aai.RealtimeTranscript</span>):</span>
       <span class="hljs-keyword">if</span> isinstance(transcript, aai.RealtimeFinalTranscript) <span class="hljs-keyword">and</span> transcript.text:
            response = self.llm.invoke(transcript.text)
            <span class="hljs-keyword">if</span> response != <span class="hljs-string">""</span>:
                self.respond_to_user_prompt(response)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">respond_to_user_prompt</span>(<span class="hljs-params">self, transcript</span>):</span>
        tmp_id = str(uuid.uuid4())
        tmp_file = <span class="hljs-string">f"/tmp/_<span class="hljs-subst">{tmp_id}</span>.mp3"</span>
        cartoonified_version = <span class="hljs-string">f"/tmp/modified_<span class="hljs-subst">{tmp_id}</span>.mp3"</span>

        self.llm.text_to_audio(transcript, tmp_file)        

        ffmpeg.input(tmp_file).filter(<span class="hljs-string">'asetrate'</span>, <span class="hljs-number">33050</span>).output(cartoonified_version).run()

        audio_data = <span class="hljs-literal">None</span>

        <span class="hljs-keyword">with</span> open(cartoonified_version, <span class="hljs-string">"rb"</span>) <span class="hljs-keyword">as</span> f:
            audio_data = f.read()

        <span class="hljs-keyword">if</span> os.path.exists(tmp_file):
            os.remove(tmp_file)
        <span class="hljs-keyword">if</span> os.path.exists(cartoonified_version):
            os.remove(cartoonified_version)

        <span class="hljs-keyword">if</span> audio_data:
            self.ws.send(audio_data)
</code></pre>
<p>The audio manager class extends Assembly AI’s <strong>RealtimeTranscriber</strong>. We override the “on_data” method to grab the finished transcription and then use OpenAI’s TTS model together with FFmpeg to generate a cartoonish voice (you should probably use a voice-cloning API for better results).</p>
<pre><code class="lang-python">    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, llm, ws</span>):</span>
        super().__init__(
            on_data=self.on_data,
            sample_rate=settings.SAMPLE_RATE,
            on_error=<span class="hljs-keyword">lambda</span> x : print(x, flush=<span class="hljs-literal">True</span>),
            disable_partial_transcripts=<span class="hljs-literal">True</span>
        )
        self.llm = llm
        self.ws = ws
        self.connect()
</code></pre>
<p>In the constructor, we set up a few configs:</p>
<ul>
<li><p><strong>on_data</strong>: This method will allow us to receive the full transcription and perform any other post-processing needed.</p>
</li>
<li><p><strong>sample_rate</strong>: Measured in hertz and basically the number of audio samples carried per second. Assembly AI supports a wide variety of sample rates, but we want to keep a consistent sample rate between the different services (OpenAI Whisper only supports 16 kHz, hence our sample rate is 16000).</p>
</li>
<li><p><strong>disable_partial_transcripts:</strong> for this application, we don’t need phrases or parts of speech; all we care about is the full transcript. Turning this setting on will minimize the network requests between us and Assembly AI, thus improving performance.</p>
</li>
</ul>
<p>Once we get a transcription back from Assembly AI, we then need to:</p>
<ul>
<li><p>Prompt the LLM to generate a response.</p>
</li>
<li><p>Send the response to the OpenAI TTS model.</p>
</li>
<li><p>Receive the audio from OpenAI TTS and then use FFMpeg to alter the audio so that it sounds similar to a cartoonish voice. OpenAI does not support custom voices, so we are stuck with human voices and therefore need to alter the pitch to sound similar to a cartoon character.</p>
</li>
</ul>
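<p>The three steps above can be sketched as a single function. Every callable here is a hypothetical stand-in for the real LLM, OpenAI TTS, and FFmpeg calls shown in the earlier code:</p>

```python
# Sketch of the respond pipeline; each callable is a stand-in for the
# real LLM client, OpenAI TTS call, and FFmpeg "asetrate" filter.
def handle_transcript(transcript, llm_reply, text_to_speech, pitch_shift, send):
    reply = llm_reply(transcript)           # 1. prompt the LLM for a response
    raw_audio = text_to_speech(reply)       # 2. synthesize speech from the reply
    cartoon_audio = pitch_shift(raw_audio)  # 3. raise the pitch for a cartoonish voice
    send(cartoon_audio)                     # push the audio back over the WebSocket
    return cartoon_audio
```

Keeping the pipeline as one function with injected callables also makes each stage easy to swap out later (e.g. replacing the pitch shift with a voice-cloning API).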
<h2 id="heading-silent-pauses-are-a-pain">Silent pauses are a pain!</h2>
<p>One of the major pain points is the “silent pause” that occurs for a few milliseconds (or even seconds in some instances) after the user stops speaking. This is the time the transcriber takes to detect the end of a sentence, plus all the AI round trips needed to generate the final response for the caller.</p>
<p>This can be annoying since it’s not a fluid conversation that you would typically have with a real person.</p>
<p>Nonetheless, it works pretty decently if you can stomach the annoying pauses.</p>
<p>A few approaches I will look at to improve the code when I get some more time:</p>
<ul>
<li><p>Rebuild in a faster language. I've actually already done this partially in Golang; it does make a difference, but it could still do with more optimizations.</p>
</li>
<li><p>Run my own model near the WebSocket server. Phi-3 is fairly good and fast; running this model with Ollama and then fine-tuning it could really improve both the speed and accuracy of the AI queries.</p>
</li>
<li><p>Train my own voice model. Similar to the above, it might be better to fine-tune an existing speech model. I've played around with PocketSphinx and Whisper; both are great options and not complicated to train.</p>
</li>
<li><p>Implement using a speech-to-speech model. These improve response times drastically but can be quite pricey at this stage.</p>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>While this “toy app” is not much at this stage, it's useful enough to amuse my kid 😀 and a great base to build a more commercial offering on later. I see Voice AI growing in the next few years and being very useful in environments such as call centers, or for building custom business context-aware agents.</p>
<p>The great news is that the technology gets better every few months, and we are spoiled for choice. At the time of writing, I noticed several new speech-to-speech APIs springing up, including OpenAI's real-time models. The only caveat is that they are just too expensive to run. ElevenLabs is probably the most cost-effective option at this stage, at around $99 per 1000 minutes of audio.</p>
<p>I plan to give ElevenLabs a try, and Deepgram agents will be interesting as well.</p>
]]></content:encoded></item><item><title><![CDATA[Django vs Laravel! Which to choose for your next project?]]></title><description><![CDATA[I have been writing PHP since I first started programming. It's always been a very flexible and easy-to-use language that grows with you, and coupled with Laravel, you’ve got a gem of beauty, a super-charged modern framework that’s flexible, clean, a...]]></description><link>https://kevincoder.co.za/django-vs-laravel-which-to-choose-for-your-next-project</link><guid isPermaLink="true">https://kevincoder.co.za/django-vs-laravel-which-to-choose-for-your-next-project</guid><category><![CDATA[Laravel]]></category><category><![CDATA[Django]]></category><category><![CDATA[Python]]></category><category><![CDATA[PHP]]></category><category><![CDATA[Django VS Laravel]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Wed, 11 Sep 2024 15:11:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1726067062564/31e45e3b-3028-41c6-bedd-80fe8c80740c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I have been writing PHP since I first started programming. It's always been a very flexible and easy-to-use language that grows with you, and coupled with Laravel, you’ve got a gem of beauty, a super-charged modern framework that’s flexible, clean, and powerful!</p>
<p>Python is cool; it's compact, versatile, and has a broad spectrum of use cases, especially in web development and machine learning. Django is well-designed with batteries included and is one of the most mature frameworks around.</p>
<p>Since I have worked extensively with both frameworks and languages, this gives me a unique perspective on things. In this article, I will give you a general overview of both frameworks to help you choose the right one for your needs.</p>
<h2 id="heading-which-is-better">Which is better?</h2>
<p>Neither; there is no "better" stack. Both frameworks are an excellent choice for most use cases.</p>
<p>Disappointing? Sorry, I know you came here for a final verdict on which is better, but my goal is to give you a deep technical review of features and functionality so that you can make an informed decision, not one based solely on hype or opinions.</p>
<h2 id="heading-destroy-the-world">Destroy the world!?</h2>
<p>Before we continue with more framework-related stuff, it's important to understand one of the most fundamental differences between PHP and Python.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1726067367216/883276fb-138d-45ac-bc41-43929ef4b1a1.jpeg" alt class="image--center mx-auto" /></p>
<p>PHP spawns up a worker process that loads Laravel "from scratch" on every request, thus the lifecycle for a Laravel app begins when a request is made and ends when the worker closes the request.</p>
<p>The PHP engine does use a JIT compiler and some fancy caching optimizations to make this process fast, so it's not exactly "from scratch" each time. However, if you declare a global variable, that variable gets destroyed when the request is finished.</p>
<p>Python on the other hand loads Django into memory once at startup, thus any global variables that exist on startup will persist throughout every request until the application restarts.</p>
<p>This is why when you change a PHP file, it immediately updates on the next request without the need for a restart. In Django, you will most likely need to restart the Gunicorn workers for the new code to take effect.</p>
<p>This is great for performance: since Django already has everything it needs loaded, it can serve up requests much faster. It's not so great if you want to run your application in a shared environment such as cPanel hosting, though; because each Django instance is bootstrapped once, the entire application sits in memory all the time, waiting for requests.</p>
<p>The PHP application on the other hand only uses up resources (besides a small percentage for caching) when there are requests.</p>
<blockquote>
<p>PHP does have alternative runtimes like Octane and FrankenPHP, which work similarly to Django's load-once-into-memory approach; however, PHP-FPM is the default and most commonly used runtime in the PHP world.</p>
</blockquote>
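<p>A tiny, framework-free illustration of this lifecycle difference: in a long-lived Python process, module-level state survives across requests, which PHP-FPM's per-request lifecycle would never allow. The view function and counter below are purely illustrative:</p>

```python
# Module-level state lives as long as the worker process (Django/Gunicorn style).
# Under PHP-FPM, this counter would be reset on every single request.
request_count = 0

def counter_view(request):
    global request_count
    request_count += 1  # keeps incrementing until the process restarts
    return f"Served {request_count} requests since startup"
```

This is also why persistent state like DB connection pools is cheap in Django, and why forgotten module-level caches can quietly leak memory between requests.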
<h2 id="heading-package-management-and-setting-up">Package management and setting up</h2>
<p>Package management is a vital component of modern development; you will almost always need to pull down at least one external package in your project at some point. Both PHP and Python have excellent package managers.</p>
<p>Python’s most popular package manager is PIP. To use PIP you first need a virtual environment:</p>
<pre><code class="lang-bash">python3 -m venv venv
<span class="hljs-built_in">source</span> venv/bin/activate
</code></pre>
<p>The above commands set up a virtual environment that isolates your project dependencies from the system's Python (a very good idea on Linux! Since the OS heavily relies on Python). Furthermore, you can keep a separate virtual environment for each project.</p>
<p>Now you can run the following to install Django and set up a Django project :</p>
<pre><code class="lang-bash">pip install django
django-admin startproject myproject
</code></pre>
<p>PHP has a similar setup process, except PHP uses Composer as its package manager, which you can download and install from <a target="_blank" href="https://getcomposer.org/">here</a>. You will need to have PHP installed beforehand. If you are on Mac or Windows, you can basically skip all of these steps and just download <a target="_blank" href="https://herd.laravel.com/windows">herd</a>. Herd is a Laravel-provided toolset that installs everything you need to build a Laravel application, including the relevant DB servers.</p>
<p>Once composer is installed on your system, you can install Laravel as follows:</p>
<pre><code class="lang-bash">composer create-project laravel/laravel myprojectname --prefer-dist
</code></pre>
<p>Composer will automatically configure itself within the "myprojectname" project and create a "vendor" folder which is similar to the virtual environment setup in Python.</p>
<p>The only difference is that the "vendor" folder just contains PHP packages, not an entire PHP runtime.</p>
<p>Python links (or copies) the Python runtime into the virtual environment, so it's almost completely isolated (except for C libraries installed via your system package manager).</p>
<p>To install a package via composer:</p>
<pre><code class="lang-bash">composer require vendor/package

# e.g.
composer require guzzlehttp/guzzle
</code></pre>
<h2 id="heading-eloquent">Eloquent</h2>
<p>Laravel has a great ORM called Eloquent; it's very close to raw SQL and feels natural for someone like me who prefers writing SQL. I like using native DB functionality and optimizing queries by hand, so I enjoy writing queries with Eloquent.</p>
<p>Here is an example model:</p>
<pre><code class="lang-php"><span class="hljs-meta">&lt;?php</span>

<span class="hljs-keyword">namespace</span> <span class="hljs-title">App</span>\<span class="hljs-title">Models</span>;

<span class="hljs-keyword">use</span> <span class="hljs-title">Illuminate</span>\<span class="hljs-title">Database</span>\<span class="hljs-title">Eloquent</span>\<span class="hljs-title">Model</span>;

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Product</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">Model</span>
</span>{

}
</code></pre>
<p>Eloquent will automatically associate this model with a "products" table in our database. Furthermore, you do not have to explicitly declare every table field, because the fields automatically become available on instances of the class, e.g. "$product-&gt;price".</p>
<p>To start querying the model you can then try the following:</p>
<pre><code class="lang-php">Product::where(<span class="hljs-string">"price"</span>, <span class="hljs-string">"&gt;="</span>, <span class="hljs-number">100</span>)
    -&gt;skip(<span class="hljs-number">5</span>)
    -&gt;take(<span class="hljs-number">10</span>)
    -&gt;get()
</code></pre>
<p>This is an elementary example but should give you an idea of how easy it is to use Eloquent.</p>
<p>The great thing about Laravel in general is how flexible the framework is. For example, if you do not want to declare a model class, you can just use the database facade directly:</p>
<pre><code class="lang-php"><span class="hljs-keyword">use</span> <span class="hljs-title">Illuminate</span>\<span class="hljs-title">Support</span>\<span class="hljs-title">Facades</span>\<span class="hljs-title">DB</span>;
DB::table(<span class="hljs-string">"products"</span>)-&gt;where(<span class="hljs-string">"price"</span>, <span class="hljs-string">"&gt;="</span>, <span class="hljs-number">100</span>)
    -&gt;skip(<span class="hljs-number">5</span>)
    -&gt;take(<span class="hljs-number">10</span>)
    -&gt;get()
</code></pre>
<p>You can also just use regular old raw SQL as follows:</p>
<pre><code class="lang-php">$products = DB::select(<span class="hljs-string">"select statement here"</span>)
</code></pre>
<h2 id="heading-django-orm">Django ORM</h2>
<p>Django has a similar concept of models; however, you have to explicitly declare each field:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> django.db <span class="hljs-keyword">import</span> models

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Product</span>(<span class="hljs-params">models.Model</span>):</span>
    name = models.CharField(max_length=<span class="hljs-number">255</span>)
    price = models.DecimalField(max_digits=<span class="hljs-number">5</span>, decimal_places=<span class="hljs-number">2</span>)
    visible = models.BooleanField(default=<span class="hljs-literal">False</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__str__</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">return</span> self.name
</code></pre>
<p>Unlike Eloquent's "products" convention, the Django ORM derives the table name from the app label and model name, e.g. "myapp_product" by default (you can override this via the model's Meta class).</p>
<p>I constantly forget the fields, especially on larger tables, thus I kinda like this way of doing things since my editor will now have table field definitions to provide IntelliSense suggestions.</p>
<p>Querying the Django ORM feels a bit weird compared to regular SQL, but because Python is so easy to read, it's not bad at all.</p>
<pre><code class="lang-python">Product.objects.filter(price__gte=<span class="hljs-number">100</span>)[<span class="hljs-number">5</span>:<span class="hljs-number">15</span>]
</code></pre>
<p>To perform raw queries you can use the ".raw" instead of "filter":</p>
<pre><code class="lang-python">query = <span class="hljs-string">"""SELECT * FROM myapp_product WHERE price &gt;= %s 
ORDER BY price LIMIT 10 OFFSET 5"""</span>
products = Product.objects.raw(query, [<span class="hljs-number">100</span>])
</code></pre>
<p>You can also use the connection cursor directly for more advanced queries:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> django.db <span class="hljs-keyword">import</span> connection

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_obj_using_raw_sql</span>(<span class="hljs-params">sql, binds</span>):</span>
    <span class="hljs-keyword">with</span> connection.cursor() <span class="hljs-keyword">as</span> cursor:
        cursor.execute(sql, binds)
        row = cursor.fetchone()

    <span class="hljs-keyword">return</span> row
</code></pre>
<h2 id="heading-migrations">Migrations</h2>
<p>Both frameworks provide great tooling for migrations. In Laravel, you have to explicitly write out your migration files:</p>
<pre><code class="lang-php"><span class="hljs-meta">&lt;?php</span>

<span class="hljs-keyword">use</span> <span class="hljs-title">Illuminate</span>\<span class="hljs-title">Database</span>\<span class="hljs-title">Migrations</span>\<span class="hljs-title">Migration</span>;
<span class="hljs-keyword">use</span> <span class="hljs-title">Illuminate</span>\<span class="hljs-title">Database</span>\<span class="hljs-title">Schema</span>\<span class="hljs-title">Blueprint</span>;
<span class="hljs-keyword">use</span> <span class="hljs-title">Illuminate</span>\<span class="hljs-title">Support</span>\<span class="hljs-title">Facades</span>\<span class="hljs-title">Schema</span>;

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">CreateUsersTable</span> <span class="hljs-keyword">extends</span> <span class="hljs-title">Migration</span>
</span>{
    <span class="hljs-keyword">public</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">up</span>(<span class="hljs-params"></span>)
    </span>{
        Schema::create(<span class="hljs-string">'users'</span>, <span class="hljs-function"><span class="hljs-keyword">function</span> (<span class="hljs-params">Blueprint $table</span>) </span>{
            $table-&gt;id();
            $table-&gt;string(<span class="hljs-string">'name'</span>);
            $table-&gt;string(<span class="hljs-string">'email'</span>)-&gt;unique();
            $table-&gt;timestamp(<span class="hljs-string">'email_verified_at'</span>)-&gt;nullable();
            $table-&gt;string(<span class="hljs-string">'password'</span>);
            $table-&gt;rememberToken();
            $table-&gt;timestamps();
        });
    }
    <span class="hljs-keyword">public</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">down</span>(<span class="hljs-params"></span>)
    </span>{
        Schema::dropIfExists(<span class="hljs-string">'users'</span>);
    }
}
</code></pre>
<p>Next, run the following command to apply these migrations against the database:</p>
<pre><code class="lang-bash">php artisan migrate
</code></pre>
<p>In Django, since you already declared all fields in the "models.py" file, you do not need to explicitly write out a migration. Simply run:</p>
<pre><code class="lang-bash">python manage.py makemigrations
python manage.py migrate
</code></pre>
<p>Every time you update a model, simply repeat the above and Django will automatically update your DB to be in sync with the model.</p>
<h2 id="heading-the-verdict-on-models-amp-migrations">The verdict on models &amp; migrations</h2>
<p>I like Django's model and migration structure; it's such a fluid experience to just declare fields and run two commands, versus having to manually write out a migration file every time.</p>
<p>With Django's ORM, as queries get more advanced, you will need to start using "Q", "annotate", "aggregate" and so forth. It becomes a little cumbersome to work with compared to Eloquent which keeps that more natural "SQL" like feel regardless of how complicated the query gets. Thus, for querying, I prefer Eloquent.</p>
<p>While this is not exactly advanced, it's sufficient to give you an idea:</p>
<pre><code class="lang-php">Product::where(<span class="hljs-string">"price"</span>, <span class="hljs-string">"&gt;="</span>, <span class="hljs-number">200</span>)
    -&gt;orWhere(<span class="hljs-string">"discount"</span>, <span class="hljs-string">"&gt;="</span>, <span class="hljs-string">"20"</span>)
    -&gt;sum(<span class="hljs-string">"price"</span>);
</code></pre>
<p>In Django:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> myapp.models <span class="hljs-keyword">import</span> Product
<span class="hljs-keyword">from</span> django.db.models <span class="hljs-keyword">import</span> Q
<span class="hljs-keyword">from</span> django.db.models <span class="hljs-keyword">import</span> Sum

products = Product.objects.filter(
     Q(price__gte=<span class="hljs-number">200</span>) | Q(discount__gte=<span class="hljs-number">20</span>)
).aggregate(Sum(<span class="hljs-string">'price'</span>))
</code></pre>
<p>At a glance, it's much easier to understand what's happening in the Eloquent code. Either way, both frameworks handle DB queries quite well and provide enough tooling around DB modeling to suit just about any use case.</p>
<h2 id="heading-templating">Templating</h2>
<p>In Laravel the main template engine is called Blade. Blade is fairly easy to understand:</p>
<pre><code class="lang-php-template"><span class="xml">@if(x == y)
// do something
@else
// do something else
@endif

My {{$variable}} name.

@foreach ($users as $user)
    <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>This is user {{ $user-&gt;id }}<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
@endforeach

@include("somefolder.some_blade_file")</span>
</code></pre>
<p>Templates are stored in "resources/views" and in your controller, you can reference a template using the path relative to the "views" folder:</p>
<pre><code class="lang-php"><span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">doSomething</span>(<span class="hljs-params"></span>) </span>{
    <span class="hljs-keyword">return</span> view(<span class="hljs-string">"users.index"</span>, [<span class="hljs-string">"user"</span> =&gt; <span class="hljs-keyword">$this</span>-&gt;user]);
    <span class="hljs-comment">// Looks for the template in: resources/views/users/index.blade.php</span>
}
</code></pre>
<p>Laravel supports both regular PHP files and Blade templates, and you can use them interchangeably; if you place a file called "index.php" in the users folder instead, the above code will work just fine as well.</p>
<p>Furthermore, you can also embed PHP code in Blade as well:</p>
<pre><code class="lang-php">@php $x = <span class="hljs-number">1</span>; @endphp
<span class="hljs-comment">// or</span>
<span class="hljs-meta">&lt;?php</span> $x = <span class="hljs-number">1</span>;<span class="hljs-meta">?&gt;</span>
</code></pre>
<p>Blade is quite powerful and flexible; in fact, I think it's probably the best template engine I have ever worked with! You can learn more <a target="_blank" href="https://laravel.com/docs/11.x/blade">here</a>; the official Laravel docs are always an excellent resource as well.</p>
<p>Besides Blade, Laravel also supports <a target="_blank" href="https://inertiajs.com/">Inertia</a>, which allows you to write your templates in React and other front-end frameworks.</p>
<p>If you don't want to deal with JS/TypeScript but want interactivity in your views, Blade can also be paired with <a target="_blank" href="https://livewire.laravel.com/">Livewire</a>. Livewire allows you to build React-like views using just PHP; it's absolutely powerful and easy to use as well!</p>
<p>Finally, Blade can also be extended using custom "Directives" as follows:</p>
<pre><code class="lang-php"><span class="hljs-comment">// In class: App\Providers\AppServiceProvider.php</span>

Blade::directive(<span class="hljs-string">'printFriendlyPostDate'</span>, <span class="hljs-function"><span class="hljs-keyword">function</span> (<span class="hljs-params">$pub_date</span>) </span>{
    <span class="hljs-keyword">return</span> <span class="hljs-string">"&lt;?php echo date('j F Y', strtotime(<span class="hljs-subst">$pub_date</span>)); ?&gt;"</span>;
});

<span class="hljs-comment">// Use in your template</span>
@printFriendlyPostDate($post-&gt;pub_date)
</code></pre>
<p>Django templating looks very similar to Blade:</p>
<pre><code class="lang-python">{% <span class="hljs-keyword">if</span> user.is_authenticated %}
    Hello, {{ user.username }}.
{% endif %}

Hi, My name <span class="hljs-keyword">is</span> {{user.name}}

&lt;ul&gt;
{% <span class="hljs-keyword">for</span> link <span class="hljs-keyword">in</span> links_list %}
    &lt;li&gt;&lt;a href=<span class="hljs-string">"{{link.url}}"</span>&gt;{{link.name}}&lt;/a&gt;&lt;/i&gt;
{% endfor %}
&lt;/ul&gt;
</code></pre>
<p>The major difference between the two is that there is no default convention as to where you should put templates. Usually, Django developers place templates in a directory called "templates/", but this is configurable.</p>
<p>To be fair, you can also change Blade's default templates directory, but nobody ever does.</p>
<p>In Django to render a template:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">index</span>(<span class="hljs-params">request</span>):</span>
    users = Users.objects.order_by(<span class="hljs-string">"-register_date"</span>)[:<span class="hljs-number">5</span>]
    context = {<span class="hljs-string">"users"</span>: users}
    <span class="hljs-keyword">return</span> render(request, <span class="hljs-string">"users/index.html"</span>, context)
</code></pre>
<p>Notice that Django is much stricter: you have to stick to the template syntax and cannot just use regular Python. However, there is a concept of "template tags", which allows you to write your own custom template functions.</p>
<p>Template tags are similar to the Blade directives mentioned above:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Place in users/templatetags/custom_tags.py</span>
<span class="hljs-keyword">from</span> django <span class="hljs-keyword">import</span> template
register = template.Library()

<span class="hljs-meta">@register.filter(name='lower_me')</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">lower_me</span>(<span class="hljs-params">value</span>):</span>
    <span class="hljs-keyword">return</span> value.lower()

<span class="hljs-comment"># In your template</span>
templates/users/index.html

{% load custom_tags %}

Hi {{name | lower_me}}
</code></pre>
<p>Template tags should be placed in a folder called "templatetags" inside your app folder.</p>
<p>While Inertia is optimized for Laravel, it's possible to use <a target="_blank" href="https://github.com/inertiajs/inertia-django">this</a> third-party open-source adapter to get Inertia support in Django. I haven't used it myself, so I can't say if it's any good; it seems very basic compared to the Laravel integration.</p>
<p>If you're looking for an alternative to Livewire, there is <a target="_blank" href="https://www.django-unicorn.com/">Django Unicorn</a>. I briefly played around with it, but it lacks a lot of features and is very basic, so it's probably better to just use HTMX or React.</p>
<p>Either way, Django's support for these kinds of frontend integrations is not great; however, Django templates are super powerful, and you can always manually add your own integration with React, Vue, or any other frontend stack.</p>
<h2 id="heading-django-is-strict">Django is strict!</h2>
<p>Both frameworks are very opinionated, thus there is always the "Laravel way" or "Django way" of doing things.</p>
<p>Django however is a tad bit more strict and rigid. One example is accessing Auth information:</p>
<pre><code class="lang-php">
Auth::user()
</code></pre>
<p>In Laravel, you can access this facade just about anywhere in your code, not just in controllers. In Django, on the other hand, you have to pass the "request" object down the line manually; otherwise, other modules in your code will not have access to the user information.</p>
<p>Generally, facades work in this fashion, where you can simply just import the facade anywhere and access any information that the facade has access to.</p>
<p>This flexibility is great, but it's not always a good thing when it comes to clean code and even security, so Django's rigidity is probably a blessing in disguise. I have seen some weird projects butcher Laravel, whereas the average Django project seems much cleaner and better structured.</p>
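<p>A rough sketch of the contrast (all names here are made up for illustration): Django pushes you to pass the user down explicitly, while a Laravel facade behaves like a globally reachable accessor:</p>

```python
# Django style: the caller must hand the user down explicitly,
# so every function's dependencies are visible in its signature.
def build_report(user):
    return f"Report for {user['name']}"

# Laravel-facade style: a globally reachable accessor, callable from anywhere.
_current_user = {"name": "Kevin"}

def auth_user():
    return _current_user

def build_report_via_facade():
    # No user parameter; the dependency is hidden inside the function.
    return f"Report for {auth_user()['name']}"
```

The explicit version is more verbose but easier to test and reason about; the facade version is convenient but hides who depends on the current user.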
<h2 id="heading-background-jobs">Background jobs</h2>
<p>Both frameworks provide a mechanism for handling background jobs; however, Django is a bit more complicated since you have to set up Celery and do some configuration in your app to set up queues.</p>
<p>There are too many steps to cover in this article, but you can read further about how to set up Celery <a target="_blank" href="https://docs.celeryq.dev/en/stable/django/first-steps-with-django.html">here</a>.</p>
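<p>For a rough comparison, the Django-side wiring boils down to something like this (module paths, the broker URL, and the task name are illustrative; see the Celery docs linked above for the full setup):</p>

```python
# myproject/celery.py -- illustrative Celery bootstrap for a Django project
import os
from celery import Celery

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")

app = Celery("myproject", broker="redis://localhost:6379/0")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks()  # finds tasks.py modules in your installed apps

# myapp/tasks.py -- a background job, queued with send_emails.delay(payload)
@app.task
def send_emails(payload):
    ...  # do the slow work here, outside the request cycle
```

You then run a worker process (e.g. <code>celery -A myproject worker</code>) much like Laravel's <code>queue:work</code> daemon.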
<p>In Laravel to create a background job:</p>
<pre><code class="lang-php">php artisan make:job SendEmails
</code></pre>
<p>This adds a new class to "app/Jobs/SendEmails.php". Anytime you need to queue a job, simply invoke the class as follows:</p>
<pre><code class="lang-php">SendEmails::dispatch($some_object_or_data);

<span class="hljs-comment">// OR to push on any other queue besides the default one:</span>
SendEmails::dispatch($some_object_or_data)-&gt;onQueue(<span class="hljs-string">"myqueue"</span>);
</code></pre>
<p>This will now store jobs in Redis or your SQL DB and process them in the background (via a "default" queue). You will just need to set up one daemon process:</p>
<pre><code class="lang-php">php artisan queue:work
<span class="hljs-comment">// you can pass --queue to specify which queue(s) if you have more than one.</span>

<span class="hljs-comment">// Or you can use Horizon if you have multiple queues</span>
<span class="hljs-comment">// Install horizon first before running: </span>

<span class="hljs-comment">// --&gt; composer require laravel/horizon</span>
<span class="hljs-comment">// --&gt; php artisan horizon:install</span>
php artisan horizon
</code></pre>
<p>Horizon is an optional tool you can install; it gives you a metrics dashboard and will automatically run and manage all your queues, instead of you manually setting up a daemon for each queue.</p>
<h2 id="heading-routing">Routing</h2>
<p>Both frameworks provide a similar routing mechanism. In Django, routes are maintained in a "urls.py" file inside the project folder:</p>
<pre><code class="lang-python">from django.urls import path, include
from accounts import views

urlpatterns = [
    path(<span class="hljs-string">"user/profile/"</span>, views.my_account),
    path(<span class="hljs-string">"user/&lt;int:id&gt;/"</span>, views.view_user),
    path(<span class="hljs-string">"users/"</span>, views.user_list),

    <span class="hljs-comment"># Group all these URLs with the "admin" prefix</span>
    path(<span class="hljs-string">'admin/'</span>, include(<span class="hljs-string">'some_app_level_urls.admin_urls'</span>)),

    <span class="hljs-comment"># Named routes</span>
    path(<span class="hljs-string">'dashboard/'</span>, views.dashboard, name=<span class="hljs-string">"dashboard"</span>)
]
</code></pre>
<p>Learn more about Django URLs <a target="_blank" href="https://docs.djangoproject.com/en/5.1/topics/http/urls/">here</a>.</p>
<p>In Laravel, there are "routes/web.php" and "routes/api.php" :</p>
<pre><code class="lang-php">
Route::get(<span class="hljs-string">'/search'</span>, <span class="hljs-string">'SearchController@index'</span>);

<span class="hljs-comment"># Named routes</span>
Route::post(<span class="hljs-string">'/searchTerm'</span>, <span class="hljs-string">'SearchController@save'</span>)
    -&gt;name(<span class="hljs-string">"save_search_term"</span>);

<span class="hljs-comment"># Group routes with a prefix</span>
Route::prefix(<span class="hljs-string">'stores'</span>)-&gt;group(<span class="hljs-function"><span class="hljs-keyword">function</span>(<span class="hljs-params"></span>) </span>{
    Route::get(<span class="hljs-string">'/{id}/show'</span>, <span class="hljs-string">'StoresController@show'</span>);
    Route::get(<span class="hljs-string">'/{id}/products'</span>, <span class="hljs-string">'StoresController@products'</span>);
});
</code></pre>
<p>Learn more about Laravel routes <a target="_blank" href="https://laravel.com/docs/11.x/routing">here</a>. Both routers are fairly flexible and easy to use, Laravel does have a little edge since the syntax is declarative using HTTP verbs and you can easily build routes by chaining methods "prefix", "group", "middleware" and more.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Django, in terms of design, clean code, and its MVT pattern, is probably the better-structured framework. Plus, Python is really easy to work with, easy to learn, and has a vast ecosystem of libraries for just about anything.</p>
<p>Laravel is not just a framework, it’s an entire eco-system with loads of free and paid first-party packages such as Reverb, Pulse, Horizon, Jetstream, Herd, etc…</p>
<p>Thus Laravel’s ecosystem is more complete and will get you up and running fast regardless of the task at hand. Django on the other hand does have batteries included but these are limited and move slowly.</p>
<p>An example is the Django admin, it still looks like an admin panel from the early 2000s and is a bit cumbersome to customize, Laravel on the other hand has many options to choose from and they each look modern with dark theme support and tailwind.</p>
<p>Furthermore, PHP is still one of the most dominant web languages around and constantly evolving to keep up with modern standards.</p>
<p>All in all, both frameworks are a solid choice. If you prefer the “C”-style syntax, go with Laravel; if you prefer Python, then Django is your best bet.</p>
<p>If you need modern TypeScript and JS framework support, Laravel is probably the better option, with Inertia.js and much better tooling, including Livewire.</p>
]]></content:encoded></item><item><title><![CDATA[Golang for Python developers]]></title><description><![CDATA[Python is one of my favorite languages; I use it for various tasks including machine learning, web development, and other general scripts. Usually, Python is more than sufficient for most tasks, however, when it comes to high-performance use cases wh...]]></description><link>https://kevincoder.co.za/golang-for-python-developers</link><guid isPermaLink="true">https://kevincoder.co.za/golang-for-python-developers</guid><category><![CDATA[Golang for python developers]]></category><category><![CDATA[Python]]></category><category><![CDATA[golang]]></category><category><![CDATA[Tutorial]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Fri, 06 Sep 2024 18:39:35 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725647862354/95447e6e-e5ad-42b6-8a90-1089794d539f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Python is one of my favorite languages; I use it for various tasks including machine learning, web development, and other general scripts. Usually, Python is more than sufficient for most tasks, however, when it comes to high-performance use cases when you need to maximize your hardware potential, Python can be rather slow.</p>
<p>This is where Golang shines; it’s a compiled language which means it’s much faster for most tasks compared to Python. Although, similar to Python, it’s fairly easy to learn. The language promotes minimalism and does not include unnecessary bloat even though the standard library is really comprehensive and you often don’t need to reach for third-party libraries.</p>
<p>In this guide, I will give you a general introduction to Golang from the perspective of a Python developer and show you why Golang should be the next language you pick up.</p>
<h2 id="heading-golang-is-compiled">Golang is compiled</h2>
<p>This may seem quite obvious, but it makes a huge difference having code compiled versus interpreted. Besides compiler optimizations that can improve the performance of your code, compilation also ensures types are enforced correctly, thus reducing the number of bugs that end up in production. (Of course, unit testing is also a good idea.)</p>
<p>Python is both strongly typed and dynamically typed. Take this example for instance:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">doMath</span>(<span class="hljs-params">x, y</span>):</span>
    <span class="hljs-keyword">return</span> x+y
</code></pre>
<p>If you had to run this code, Python will not throw any errors, however, try doing this:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">doMath</span>(<span class="hljs-params">x, y</span>):</span>
    <span class="hljs-keyword">return</span> x+y

print(doMath(<span class="hljs-number">1</span>, <span class="hljs-string">"1"</span>))
</code></pre>
<p>You should see:</p>
<pre><code class="lang-python"><span class="hljs-keyword">return</span> x+y
           ~^~
TypeError: unsupported operand type(s) <span class="hljs-keyword">for</span> +: <span class="hljs-string">'int'</span> <span class="hljs-keyword">and</span> <span class="hljs-string">'str'</span>
</code></pre>
<p>Without unit tests or static analysis in your IDE, Python code can create many runtime errors if you are not careful. In Golang, it's much harder to create these sorts of bugs because the program won't even compile, even if you are not calling that function or block of code.</p>
<h2 id="heading-golang-variables-arrays-maps">Golang variables, arrays, maps</h2>
<p>In Golang there are two main ways to create variables:</p>
<pre><code class="lang-go"><span class="hljs-keyword">var</span> someInteger <span class="hljs-keyword">int</span>
<span class="hljs-comment">// OR</span>
someInteger := <span class="hljs-number">10</span>
</code></pre>
<p>You can explicitly set the type using "var" or you can let Go figure out the type by using this operator: ":=". Either way, once you have declared the variable, you can not change its type.</p>
<p>In Python, the following will work fine, but Golang will throw a type error:</p>
<pre><code class="lang-python">x = <span class="hljs-number">1</span>
x = <span class="hljs-string">"Some string"</span>
</code></pre>
<p>Similar to Python, Golang has a couple of container types:</p>
<pre><code class="lang-go"><span class="hljs-comment">// Python list: items = [1,2,3,4]</span>

<span class="hljs-comment">// Golang slice</span>
items := []<span class="hljs-keyword">int</span>{<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>}

<span class="hljs-comment">// Python dictionary: carData = {"make": "Toyota", "model": "Corolla"}</span>

<span class="hljs-comment">// Golang maps</span>
carData := <span class="hljs-built_in">make</span>(<span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>)
carData[<span class="hljs-string">"make"</span>]  = <span class="hljs-string">"Toyota"</span>
carData[<span class="hljs-string">"model"</span>] = <span class="hljs-string">"Corolla"</span>
</code></pre>
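<p>Where Python has "append", "in", and "del", Go has the built-in "append", the comma-ok map lookup, and "delete". A short, self-contained sketch (the variable names are just illustrative):</p>
<pre><code class="lang-go">package main

import "fmt"

func main() {
    // Python: items.append(5)
    items := []int{1, 2, 3, 4}
    items = append(items, 5)
    fmt.Println(len(items))

    carData := map[string]string{"make": "Toyota"}

    // Python: "make" in carData
    if v, ok := carData["make"]; ok {
        fmt.Println(v)
    }

    // Python: del carData["make"]
    delete(carData, "make")
    fmt.Println(len(carData))
}
</code></pre>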
<p>In Golang there is no "None" type, however you can simply use "nil" instead. Here are some variable types:</p>
<pre><code class="lang-go"><span class="hljs-keyword">var</span> number <span class="hljs-keyword">int</span>
<span class="hljs-keyword">var</span> number <span class="hljs-keyword">int64</span>
<span class="hljs-keyword">var</span> number <span class="hljs-keyword">float64</span>
<span class="hljs-keyword">var</span> text <span class="hljs-keyword">string</span>
<span class="hljs-comment">// ...plus most of the other common types found in other languages</span>

<span class="hljs-keyword">var</span> anything <span class="hljs-keyword">interface</span>{}
</code></pre>
<p>An "interface{}" type is a special type that allows you to store any other type in that variable. It does not follow the same rules as regular types, it's a dynamic type, thus the compiler will not throw type errors for this type if you do something like:</p>
<pre><code class="lang-go">    <span class="hljs-keyword">var</span> anything <span class="hljs-keyword">interface</span>{}
    anything = <span class="hljs-string">"Some text"</span>
    anything = <span class="hljs-number">101</span>
    fmt.Println(anything)
</code></pre>
<p>This is terrible Go code, and you should avoid the interface type as far as possible. One common use case for it, however, is handling JSON. Since JSON can be loosely structured, a field could be a number, a string, or even an object depending on some condition; using this type for those dynamic fields lets the JSON parse without throwing type errors.</p>
<p>Here's an example of a dynamic JSON object:</p>
<pre><code class="lang-json">{
   <span class="hljs-attr">"product_name"</span>: <span class="hljs-string">"D-T42D15 42 Dled Television"</span>,
   <span class="hljs-attr">"sku"</span>: <span class="hljs-string">"X42310555"</span>,
   <span class="hljs-attr">"price"</span>: <span class="hljs-number">2999.00</span>,
   <span class="hljs-attr">"currency"</span>: <span class="hljs-string">"ZAR"</span>
}
</code></pre>
<pre><code class="lang-json">{
   <span class="hljs-attr">"product_name"</span>: <span class="hljs-string">"D-T42D15 42 Dled Television"</span>,
   <span class="hljs-attr">"sku"</span>: <span class="hljs-number">18745534</span>,
   <span class="hljs-attr">"price"</span>: <span class="hljs-number">2999.00</span>,
   <span class="hljs-attr">"currency"</span>: <span class="hljs-string">"ZAR"</span>
}
</code></pre>
<p>As you can see "sku" is a string sometimes and other times a number. This will cause typing errors in Golang because the type has to be a number or string, and cannot be both.</p>
<p>Therefore you can initially use an interface, and then in your code figure out what the current items data type is at runtime:</p>
<pre><code class="lang-go"><span class="hljs-comment">// For now, lets assume this function returns the</span>
<span class="hljs-comment">// product in the form of a struct (similar to python objects).</span>
<span class="hljs-comment">// We'll learn about structs in a bit...</span>
product := getTheProduct()

<span class="hljs-comment">// check what the type is and then create a new variable with</span>
<span class="hljs-comment">// the correct type.</span>
theSKU := <span class="hljs-string">""</span>
<span class="hljs-keyword">switch</span> product.SKU.(<span class="hljs-keyword">type</span>) {
    <span class="hljs-keyword">case</span> <span class="hljs-keyword">string</span>:
       theSKU = product.SKU.(<span class="hljs-keyword">string</span>)
    <span class="hljs-keyword">case</span> <span class="hljs-keyword">int</span>:
        theSKU = fmt.Sprintf(<span class="hljs-string">"%d"</span>, product.SKU)
}

fmt.Println(theSKU)
</code></pre>
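<p>To make this concrete, here is a self-contained sketch (the struct and field names are illustrative, not from the snippet above) that parses both payloads with the standard "encoding/json" package. One gotcha worth knowing: "encoding/json" decodes any JSON number stored in an "interface{}" as a "float64", so that is the type to switch on:</p>
<pre><code class="lang-go">package main

import (
    "encoding/json"
    "fmt"
)

// Product uses interface{} for the loosely-typed "sku" field.
type Product struct {
    Name string      `json:"product_name"`
    SKU  interface{} `json:"sku"`
}

// normalizeSKU returns the SKU as a string regardless of the
// underlying JSON type (string or number).
func normalizeSKU(raw interface{}) string {
    switch v := raw.(type) {
    case string:
        return v
    case float64: // JSON numbers decode into float64
        return fmt.Sprintf("%.0f", v)
    }
    return ""
}

func main() {
    docs := []string{
        `{"product_name": "TV", "sku": "X42310555"}`,
        `{"product_name": "TV", "sku": 18745534}`,
    }
    for _, doc := range docs {
        var p Product
        if err := json.Unmarshal([]byte(doc), &amp;p); err != nil {
            panic(err)
        }
        fmt.Println(normalizeSKU(p.SKU))
    }
}
</code></pre>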
<h2 id="heading-golang-loops-if-switch-and-range">Golang loops, if, switch, and range</h2>
<p>Golang is very minimal, thus the language doesn't have as many reserved keywords as Python. For example, the "while" keyword does not exist! Instead, Golang's for loop is very flexible and can be used for a wide variety of loops that other languages generally separate into different reserved keywords.</p>
<p>Here are some examples:</p>
<pre><code class="lang-go">start := <span class="hljs-number">1</span>
end := <span class="hljs-number">10</span>

<span class="hljs-keyword">for</span> start &lt; end {
    start += <span class="hljs-number">1</span>
    fmt.Println(start)
}

<span class="hljs-comment">// this is the most common Loop. Use when you want to iterate</span>
<span class="hljs-comment">// over lists or maps.</span>
items := []<span class="hljs-keyword">int</span>{<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>}
<span class="hljs-keyword">for</span> k, v := <span class="hljs-keyword">range</span> items {
   fmt.Println(k,v)
}

sum := <span class="hljs-number">0</span>
<span class="hljs-keyword">for</span> i := <span class="hljs-number">1</span>; i &lt; <span class="hljs-number">5</span>; i++ {
    sum += i
}
fmt.Println(sum)

<span class="hljs-comment">// GORM db rows: https://gorm.io/docs/sql_builder.html</span>
rows := db.Rows()

<span class="hljs-keyword">for</span> rows.Next() {
   <span class="hljs-keyword">var</span> name <span class="hljs-keyword">string</span>
    rows.Scan(&amp;name)
    fmt.Println(name)
}
</code></pre>
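<p>One form the examples above don't show: a bare "for" with no condition at all is Go's equivalent of Python's "while True", typically paired with "break". A minimal sketch:</p>
<pre><code class="lang-go">package main

import "fmt"

// countToThree loops forever until break fires,
// just like Python's "while True:".
func countToThree() int {
    n := 0
    for {
        n++
        if n == 3 {
            break // exit the infinite loop
        }
    }
    return n
}

func main() {
    fmt.Println(countToThree())
}
</code></pre>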
<p>"If" statements are very similar to Python:</p>
<pre><code class="lang-go">age := <span class="hljs-number">22</span>

<span class="hljs-keyword">if</span> age &lt; <span class="hljs-number">18</span> {
   fmt.Println(<span class="hljs-string">"You're still a kid."</span>)
} <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> age &gt; <span class="hljs-number">18</span> {
   fmt.Println(<span class="hljs-string">"You can both drink and drive. Just not at the same time!"</span>)
} <span class="hljs-keyword">else</span> {
   fmt.Println(<span class="hljs-string">"Sorry, but your youth is slowly fading :-("</span>)
}
</code></pre>
<p>In early versions of Python (before v3.10), there were no switch statements; nowadays, you can use "match" and "case". Luckily for Go developers, switch statements exist and work similarly to those in most other modern languages:</p>
<pre><code class="lang-go">age := <span class="hljs-number">22</span>

<span class="hljs-keyword">switch</span> age {
    <span class="hljs-keyword">case</span> <span class="hljs-number">18</span>:
        fmt.Println(<span class="hljs-string">"You're on your way to adulthood."</span>)
    <span class="hljs-keyword">case</span> <span class="hljs-number">65</span>:
        fmt.Println(<span class="hljs-string">"#Retirement"</span>)
}
</code></pre>
<p>In Golang the "break" keyword is not required; Go automatically breaks out of the switch after the first matching case. The "break" statement is mainly useful when the code inside a case block has an if statement or some other logic with multiple pathways.</p>
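<p>The opposite of "break" also exists: the "fallthrough" keyword forces execution to continue into the next case, for the rare times you want C-style behavior. A minimal sketch:</p>
<pre><code class="lang-go">package main

import "fmt"

// describe shows that a matching case normally ends the switch,
// unless fallthrough explicitly continues into the next case.
func describe(n int) string {
    s := ""
    switch n {
    case 1:
        s += "one+"
        fallthrough // run the next case as well
    case 2:
        s += "two"
    }
    return s
}

func main() {
    fmt.Println(describe(1)) // one+two
    fmt.Println(describe(2)) // two
}
</code></pre>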
<h2 id="heading-golang-modules">Golang modules</h2>
<p>Modules are essential in any language; In Python creating modules is a breeze, you just use the directory and file name in your import statement to reference whatever code you need:</p>
<pre><code class="lang-python">myapp/
   accounts/models.py

<span class="hljs-comment"># Becomes</span>
<span class="hljs-keyword">from</span> accounts.models <span class="hljs-keyword">import</span> User
</code></pre>
<p>Golang has a similar mechanism; however, it's a tad more complicated. The Go file name is usually irrelevant; instead, you need to declare a "package" directive at the top of every code file. To start a new Go project, you first have to run the following in your terminal inside the project's root folder:</p>
<pre><code class="lang-bash">go mod init mycompany.com/ecommerce
</code></pre>
<p>The above will create a "go.mod" file that is essentially your project's package information similar to NPM's "package.json" file. This file will also keep track of all the various dependencies you add to your project.</p>
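<p>After running the command above and pulling in a dependency, the generated "go.mod" looks roughly like this (the module path and versions here are just illustrative):</p>
<pre><code class="lang-bash">module mycompany.com/ecommerce

go 1.22

require github.com/labstack/echo/v4 v4.11.4
</code></pre>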
<p>Next, in every folder where you add a ".go" file, you need to add a "package" directive at the top of the file that's the same name as the parent folder:</p>
<pre><code class="lang-bash"> myapp/
   accounts/models.go
main.go <span class="hljs-comment"># the package for main.go will always be "package main"</span>
</code></pre>
<p>Inside models.go:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> accounts

<span class="hljs-keyword">import</span> <span class="hljs-string">"fmt"</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">PrintHelloWorld</span><span class="hljs-params">()</span></span> {
    fmt.Println(<span class="hljs-string">"Hello World"</span>)
}
</code></pre>
<p>To import this function in our main.go file:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main
<span class="hljs-keyword">import</span> <span class="hljs-string">"mycompany.com/ecommerce/accounts"</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
   accounts.PrintHelloWorld()
}
</code></pre>
<p>Your function or variable inside these modules must start with a capital letter to be visible outside of the module; thus, if I named the function "printHelloWorld", Go would throw an "undefined" compile error in our main function.</p>
<blockquote>
<p>ℹ️ If you created another file inside the module folder "accounts" e.g. utils.go with the "package accounts" directive at the top, that code file will still have access to "printHelloWorld".</p>
</blockquote>
<p>To compile a go program you can run the following in your terminal:</p>
<pre><code class="lang-bash">go run main.go

<span class="hljs-comment"># Or if you just want to build the binary without running</span>
 go build -o myapp
</code></pre>
<p>Packages can also be imported from GitHub:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main
<span class="hljs-keyword">import</span> <span class="hljs-string">"github.com/labstack/echo/v4"</span>
</code></pre>
<p>If you have multiple imports, you should group them as follows:</p>
<pre><code class="lang-go"><span class="hljs-keyword">import</span> (
  <span class="hljs-string">"github.com/labstack/echo/v4"</span>
  <span class="hljs-string">"github.com/labstack/echo/v4/middleware"</span>
  <span class="hljs-string">"net/http"</span> <span class="hljs-comment">// Paths like this are from the standard library.</span>
)
</code></pre>
<p>You can also provide an alias for a package if the name is too long to type or if you want to prevent collisions between packages with the same name:</p>
<pre><code class="lang-go"><span class="hljs-keyword">import</span> (
  <span class="hljs-string">"github.com/labstack/echo/v4"</span>
  <span class="hljs-string">"github.com/labstack/echo/v4/middleware"</span>
  comMiddleware <span class="hljs-string">"mycompany.com/ecommerce/middleware"</span>
  <span class="hljs-string">"net/http"</span>
)
</code></pre>
<p>When you add a git URL like the above to your import, you need to run the following go mod command before building or running the program:</p>
<pre><code class="lang-bash">go mod tidy
</code></pre>
<p>This is similar to pip; however, you do not need a "requirements.txt" file, nor do you need to specify each dependency individually. Go's package manager will automatically scan all your code files, look up the URL for each package, and download the relevant code.</p>
<h2 id="heading-dependencies-what-dependencies">Dependencies! What dependencies?</h2>
<p>One common issue with Python is system libraries; when you run "pip install xyz", this just installs the Python library but you may still need to manage the system libraries (and also have Python installed on the target machine) via APT or whatever software installer is used by the target machine.</p>
<p>Golang on the other hand, usually does not need any dependencies installed on the target machine, you don't even need the Go compiler installed. Go will automatically generate a self-contained binary that contains everything you need to run the application.</p>
<p>If you have static assets like images or HTML templates, you would need to copy those as well to the target machine, however, Go also provides an embedder that can package everything including static assets into a single binary file.</p>
<p>When building Go binaries you can also specify what machine architecture you want to target via a simple ENV:</p>
<pre><code class="lang-bash">GOOS=windows go build
</code></pre>
<h2 id="heading-oop-is-not-a-thing-in-go">OOP is not a thing in Go</h2>
<p>Golang does not work like other OOP languages, there are no classes and objects. Instead, Go has a nifty little "container" type called "structs". They are very similar to classes but much more lightweight.</p>
<p>Here's an example of a basic “struct”:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> <span class="hljs-string">"fmt"</span>

<span class="hljs-keyword">type</span> Product <span class="hljs-keyword">struct</span> {
    ID       <span class="hljs-keyword">int64</span>
    Name     <span class="hljs-keyword">string</span>
    Sku      <span class="hljs-keyword">string</span>
    Category <span class="hljs-keyword">string</span>
    Price    <span class="hljs-keyword">float64</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    product := Product{
        ID:       <span class="hljs-number">123</span>,
        Name:     <span class="hljs-string">"iPhone 14 pro"</span>,
        Sku:      <span class="hljs-string">"X41233"</span>,
        Price:    <span class="hljs-number">899</span>,
        Category: <span class="hljs-string">"Smartphones"</span>,
    }

    fmt.Println(product)
}
</code></pre>
<blockquote>
<p>ℹ️ Similar to modules, you need to capitalize the first letter of the struct field name, otherwise that field will not be usable on "instances" of the struct outside the package.</p>
</blockquote>
<p>You can also assign values to struct "instance" one at a time:</p>
<pre><code class="lang-go">    product := Product{}
    product.ID = <span class="hljs-number">123</span>
    product.Name = <span class="hljs-string">"iPhone 14 pro"</span>
    product.Sku = <span class="hljs-string">"X41233"</span>
    product.Price = <span class="hljs-number">899</span>
    product.Category = <span class="hljs-string">"Smartphones"</span>
</code></pre>
<blockquote>
<p>I keep using quotes around the word instances to emphasize that these are not objects, structs are just containers or collections, they are meant to be lean and efficient.</p>
</blockquote>
<p>Structs can also have functions attached to them similar to methods in a class:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> <span class="hljs-string">"fmt"</span>

<span class="hljs-keyword">type</span> Product <span class="hljs-keyword">struct</span> {
    ID       <span class="hljs-keyword">int64</span>
    Name     <span class="hljs-keyword">string</span>
    Sku      <span class="hljs-keyword">string</span>
    Category <span class="hljs-keyword">string</span>
    Price    <span class="hljs-keyword">float64</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(p *Product)</span> <span class="hljs-title">getPrice</span><span class="hljs-params">()</span> <span class="hljs-title">float64</span></span> {
    <span class="hljs-keyword">return</span> p.Price
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(p *Product)</span> <span class="hljs-title">updatePrice</span><span class="hljs-params">(newPrice <span class="hljs-keyword">float64</span>)</span></span> {
    p.Price = newPrice
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    product := Product{}
    product.ID = <span class="hljs-number">123</span>
    product.Name = <span class="hljs-string">"iPhone 14 pro"</span>
    product.Sku = <span class="hljs-string">"X41233"</span>
    product.Price = <span class="hljs-number">899</span>
    product.Category = <span class="hljs-string">"Smartphones"</span>

    product.updatePrice(<span class="hljs-number">999</span>)
    fmt.Println(product.getPrice())
}
</code></pre>
<p>Note: The same rules apply with function names, if the first letter is not capitalized, the function will not be visible outside of the module it is declared in.</p>
<p>As you build more Go programs, you'll realize that conventional OOP is overrated and Go's minimal approach with structs just makes your code much cleaner and easier to maintain.</p>
<p>You'll notice I skipped past one piece of this code, the "*" character in front of "Product". This essentially is a pointer, instead of copying the product "instance" every time, we pass around a memory location to the original "instance".</p>
<p>Everything in Python is an object, and the language already passes references around under the hood, so you never need to think about pointers in Python. Golang is just a bit more explicit, giving you fine-grained control to optimize your code for maximum efficiency.</p>
<p>You can learn more about pointers in Golang and when to use them <a target="_blank" href="https://google.github.io/styleguide/go/decisions#receiver-type">here</a>.</p>
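<p>To see why the pointer in the receiver matters, compare a value receiver, which works on a copy, with a pointer receiver, which works on the original (a small sketch using a hypothetical Counter struct):</p>
<pre><code class="lang-go">package main

import "fmt"

type Counter struct {
    N int
}

// Value receiver: c is a copy, the caller's struct is unchanged.
func (c Counter) IncByValue() {
    c.N++
}

// Pointer receiver: c points at the caller's struct, so the
// increment is visible outside the method.
func (c *Counter) IncByPointer() {
    c.N++
}

func main() {
    c := Counter{}
    c.IncByValue()
    fmt.Println(c.N) // still 0

    c.IncByPointer()
    fmt.Println(c.N) // now 1
}
</code></pre>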
<h2 id="heading-concurrency">Concurrency</h2>
<p>In Python, "asyncio" can be used to achieve concurrency and speed up your program's execution time. It works great but has a little bit of a learning curve, not too complicated but not as natural as Go's goroutines.</p>
<p>In Golang, you can simply do this:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> <span class="hljs-string">"time"</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">doSomeWork</span><span class="hljs-params">()</span></span> {
    time.Sleep(time.Second * <span class="hljs-number">5</span>)
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">go</span> doSomeWork()
}
</code></pre>
<p>How simple is that? Just add "go" in front of any function call to make that task run in the background.</p>
<p>Okay, but just one caveat here! If your main program exits before the goroutine finishes, the function will be terminated prematurely. However, usually in Go, your main program would be some kind of background service like an HTTP server, in which case this is not an issue, since the server is always running.</p>
<p>An example HTTP server:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"log"</span>
    <span class="hljs-string">"net/http"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">doSomeWork</span><span class="hljs-params">()</span></span> {
    fmt.Println(<span class="hljs-string">"Hello world from goroutine"</span>)
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    http.HandleFunc(<span class="hljs-string">"/api/endpoint"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(w http.ResponseWriter, 
    r *http.Request)</span></span> {
        <span class="hljs-keyword">go</span> doSomeWork()
        fmt.Fprintf(w, <span class="hljs-string">"Hello, World"</span>)
    })

    log.Fatal(http.ListenAndServe(<span class="hljs-string">":8080"</span>, <span class="hljs-literal">nil</span>))
}
</code></pre>
<p>Since "http.ListenAndServe" basically blocks the program from exiting and keeps it in memory forever (unless of course, some fatal error kills the program), you can safely run the goroutine with minimal risk of it not finishing.</p>
<p>If you are running a program that will exit once it finishes all tasks, like a scraper, for example, you will need to stop and wait for all goroutines to finish. You can do this using wait groups:</p>
<pre><code class="lang-go"><span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"sync"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">doSomeWork</span><span class="hljs-params">(wg *sync.WaitGroup)</span></span> {
    <span class="hljs-keyword">defer</span> wg.Done()
    fmt.Println(<span class="hljs-string">"Hello world from goroutine"</span>)
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">var</span> wg sync.WaitGroup
    <span class="hljs-keyword">for</span> i := <span class="hljs-number">1</span>; i &lt; <span class="hljs-number">5</span>; i++ {
        wg.Add(<span class="hljs-number">1</span>)
        <span class="hljs-keyword">go</span> doSomeWork(&amp;wg)
    }

    wg.Wait()
}
</code></pre>
<p>In the above code sample, we first create a wait group:</p>
<pre><code class="lang-go"><span class="hljs-keyword">var</span> wg sync.WaitGroup
</code></pre>
<p>Next, in our loop, we "tell" the wait group how many goroutines we are adding to the queue. We call "wg.Add(1)" once per goroutine; here the loop spins up four of them, but you can run as many as your needs dictate.</p>
<pre><code class="lang-go">wg.Add(<span class="hljs-number">1</span>)
<span class="hljs-keyword">go</span> doSomeWork(&amp;wg)
</code></pre>
<p>Notice that we also pass "&amp;wg" to the "doSomeWork" function. This is essentially passing the "wait group" by reference so that there is only one "instance" of the "wait group" regardless of how many goroutines we run.</p>
<p>See the power of pointers? We are maintaining a singleton, and thus ensuring the counter is always up to date and accurately matches the number of goroutines we spin up.</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">doSomeWork</span><span class="hljs-params">(wg *sync.WaitGroup)</span></span> {
    <span class="hljs-keyword">defer</span> wg.Done()
    fmt.Println(<span class="hljs-string">"Hello world from goroutine"</span>)
}
</code></pre>
<p>In the "doSomeWork" function as expected, we accept a pointer to the "wait group". If you look closely we use "defer" to tell the "wait group" that the function is complete and it should update the counter accordingly.</p>
<p>The "defer" keyword in Golang simply ensures that this line of code is always executed last just before the function finishes, regardless of where you call "defer" in your code.</p>
<p>A function can have multiple "defer" statements; they run in last-in, first-out order when the function returns. A single "defer" can also run more than one line of code: simply wrap your code in an anonymous function:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">doSomeWork</span><span class="hljs-params">(wg *sync.WaitGroup)</span></span> {
    <span class="hljs-keyword">defer</span> <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">()</span></span> {
      wg.Done()
      fmt.Println(<span class="hljs-string">"I am done"</span>)
    }() <span class="hljs-comment">// &lt;-- We use () to actually execute the function as</span>
       <span class="hljs-comment">//      soon as defer is executed.</span>

    fmt.Println(<span class="hljs-string">"Hello world from goroutine"</span>)
}
</code></pre>
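<p>Stacked defers fire in last-in, first-out order when the function returns. A quick sketch (it uses a named return value so the deferred closures can still append to the result):</p>
<pre><code class="lang-go">package main

import "fmt"

// deferOrder records the order in which stacked defers fire.
// Defers run after the function body, in reverse registration order.
func deferOrder() (order []string) {
    defer func() { order = append(order, "registered first") }()
    defer func() { order = append(order, "registered second") }()
    order = append(order, "body")
    return
}

func main() {
    fmt.Println(deferOrder()) // [body registered second registered first]
}
</code></pre>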
<p>Coming back to our main code, we call "wg.Wait()" at the end of the main function to stop the program from exiting and wait until all goroutines finish.</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">var</span> wg sync.WaitGroup
    <span class="hljs-keyword">for</span> i := <span class="hljs-number">1</span>; i &lt; <span class="hljs-number">5</span>; i++ {
        wg.Add(<span class="hljs-number">1</span>)
        <span class="hljs-keyword">go</span> doSomeWork(&amp;wg)
    }

    wg.Wait()  <span class="hljs-comment">// &lt;--------------</span>
}
</code></pre>
<h2 id="heading-handling-errors">Handling errors</h2>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"strconv"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    price, err := strconv.ParseFloat(<span class="hljs-string">"hello"</span>, <span class="hljs-number">64</span>)
    fmt.Println(price, err)
}
</code></pre>
<p>In most languages, you would expect this program to crash with some kind of parsing exception; in Python, you would usually wrap this sort of logic in a "try...except" block so that you can catch the error and handle it gracefully without crashing the whole program.</p>
<p>Go, on the other hand, does not have a "try...except" block. Instead, when an error occurs, the function simply returns an additional "error" value containing more details about what went wrong.</p>
<p>If "err" is nil, then your code executed without any errors and you can continue as normal. This may seem odd at first, but it's actually quite a nice way of handling errors, and it keeps your code clean.</p>
<p>You can even choose to ignore the error if you want (although this is usually a bad idea):</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"strconv"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-comment">// "_" swallows the error; price will be 0 when parsing fails.</span>
    price, _ := strconv.ParseFloat(<span class="hljs-string">"hello"</span>, <span class="hljs-number">64</span>)
    fmt.Println(price)
}
</code></pre>
<p>The "_" character used in this fashion discards the return value and can be used with any function return type, not just errors.</p>
<p>Should you do something silly like this, Go will just refuse to compile the program:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    result := <span class="hljs-number">1</span> / <span class="hljs-number">0</span> <span class="hljs-comment">// compile error: division by zero (and "result" would be unused)</span>
}
</code></pre>
<p>Go can throw serious errors known as "panics" which will crash your program. These usually occur when something goes badly wrong at runtime, e.g. indexing past the end of a slice, dereferencing a nil pointer, or a hardware failure that prevents Go from writing to disk.</p>
<p>Dealing with this kind of error is beyond the scope of this article, but I will cover recovering from serious errors in an upcoming article where we'll take an even deeper dive into the land of Go.</p>
<h2 id="heading-golang-pointers">Golang pointers</h2>
<p>In the Python world, we are used to passing data by reference since everything in Python is an object and references get passed around for you. Thus, you don’t really think about pointers a whole lot, but in Golang not everything is a reference; you often have to explicitly tell the compiler how to pass data around.</p>
<p><strong>What is a pointer anyway?</strong> Simply put, a pointer is just a variable that holds the memory address of another value (like a file handle, an object, or something else that holds data).</p>
<p>To expand on that idea:</p>
<pre><code class="lang-python">brands_list = <span class="hljs-literal">None</span>
<span class="hljs-keyword">with</span> open(<span class="hljs-string">"brands.txt"</span>, <span class="hljs-string">"r"</span>) <span class="hljs-keyword">as</span> f:
    brands_list = f.read()

<span class="hljs-comment"># "products" is assumed to be loaded elsewhere</span>
<span class="hljs-keyword">for</span> product <span class="hljs-keyword">in</span> products:
    parseProduct(product, brands_list)
</code></pre>
<p>Let’s just say for some bizarre reason you need to load a list of brands from a text file, and that list is needed by a function “parseProduct” which probably parses a raw product object and stores it in the DB. During that process, it needs to access the brands list.</p>
<p>Let’s say we have 1,000 products and the brands file is 5GB. If we had to read this file at each iteration, that’s going to be inefficient and slow because of the disk IO, plus the cost of copying 5GB into RAM 1,000 times. A better approach is to read the file once and then pass the data to “parseProduct”.</p>
<p>Luckily, Python passes object references around, so the memory hit is only 5GB. If Python copied the data instead, each call would hold its own 5GB in RAM until garbage collection kicks in.</p>
<p>Golang, on the other hand, will not pass by reference by default:</p>
<pre><code class="lang-go"><span class="hljs-keyword">type</span> Product <span class="hljs-keyword">struct</span> {
    Name <span class="hljs-keyword">string</span>
    <span class="hljs-comment">// other fields</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseProduct</span><span class="hljs-params">(product Product, brands <span class="hljs-keyword">string</span>)</span></span> {
    <span class="hljs-comment">// do stuff with product and brands</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    products := getProducts()

    <span class="hljs-comment">// ReadFile is a custom function used to keep this code less verbose.</span>
    brands := ReadFile(<span class="hljs-string">"brands.txt"</span>)


    <span class="hljs-keyword">for</span> _, product := <span class="hljs-keyword">range</span> products {
        parseProduct(product, brands)
    }
}
</code></pre>
<p>At each iteration, Golang copies the value of “brands” into the “parseProduct” parameter, because Go passes arguments by value. (Strictly speaking, copying a string only duplicates a small header while the underlying bytes are shared; but for a large struct or array, the entire value is copied on every call, which can easily use far more RAM than the Python version.)</p>
<p>To fix this:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseProduct</span><span class="hljs-params">(product Product, brands *<span class="hljs-keyword">string</span>)</span></span> {
    <span class="hljs-comment">// do stuff with product and brands</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    products := getProducts()

    <span class="hljs-comment">// ReadFile is a custom function used to keep this code less verbose.</span>
    brands := ReadFile(<span class="hljs-string">"brands.txt"</span>)


    <span class="hljs-keyword">for</span> _, product := <span class="hljs-keyword">range</span> products {
        parseProduct(product, &amp;brands)
    }
}
</code></pre>
<p>Whenever you put a “*” in front of a type in Golang, that makes the variable a pointer. This variable therefore is not actually a string, it’s a memory address of data that is of a “string” type.</p>
<p>Now, we only have one copy of “brands” in memory, and every time “parseProduct” runs, it simply references the initial variable we created(“brands”) instead of repetitively copying the entire file:</p>
<pre><code class="lang-go">    brands := ReadFile(<span class="hljs-string">"brands.txt"</span>)
</code></pre>
<p>The “&amp;” basically gets the memory address of the “brands” variable and passes it to the “parseProduct” function, whereas “*” is just declaring that this variable is a pointer of type “string”.</p>
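<p>To make the relationship between "&amp;" and "*" concrete, here is a tiny standalone sketch (the variable names are just for illustration):</p>

```go
package main

import "fmt"

func main() {
	brands := "nike,adidas"

	p := &brands        // & takes the address of brands; p is a *string
	fmt.Println(*p)     // * dereferences the pointer: prints "nike,adidas"

	*p = "puma"         // writing through the pointer mutates brands itself
	fmt.Println(brands) // prints "puma"
}
```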
<blockquote>
<p>⚠️ When using concurrency, mutating data through a pointer shared across goroutines can lead to data races and other concurrency bugs. This topic is beyond the scope of the current article, but if you want to learn more, I covered this subject in an earlier article <a target="_blank" href="https://kevincoder.co.za/what-is-a-mutex-in-golang">here</a>.</p>
</blockquote>
<h3 id="heading-when-to-use-pointers">When to use pointers?</h3>
<p><em>This is a topic on its own, but here are some of the main use cases for pointers:</em></p>
<ul>
<li><p>To maintain a singleton. For example, in our “parseProduct” function we are probably updating a PostgreSQL database table, it’s going to be really inefficient to open and close a DB connection 1000 times. Instead, we can open the DB connection once before the loop and then pass the pointer to that connection so that it can be reused.</p>
</li>
<li><p>Memory optimization as per the example covered above.</p>
</li>
<li><p>Mutability: Throughout the process, you may mutate data several times. It can become really cumbersome to constantly ensure you return the mutated version, thus using a pointer will make your code a lot cleaner even if there are no major memory optimization advantages.</p>
</li>
</ul>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> <span class="hljs-string">"fmt"</span>

<span class="hljs-keyword">type</span> Product <span class="hljs-keyword">struct</span> {
    Name     <span class="hljs-keyword">string</span>
    Brand    <span class="hljs-keyword">string</span>
    Category <span class="hljs-keyword">string</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(p *Product)</span> <span class="hljs-title">setBrand</span><span class="hljs-params">(brandName <span class="hljs-keyword">string</span>)</span></span> {
    p.Brand = brandName
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(p *Product)</span> <span class="hljs-title">setCategory</span><span class="hljs-params">(CategoryName <span class="hljs-keyword">string</span>)</span></span> {
    p.Category = CategoryName
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseProduct</span><span class="hljs-params">(product *Product, brands *<span class="hljs-keyword">string</span>)</span></span> {
    <span class="hljs-comment">// do stuff with product and brands</span>

    product.setBrand(<span class="hljs-string">"Some brand"</span>)
    product.setCategory(<span class="hljs-string">"Some Category"</span>)
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    products := []Product{}
    products = <span class="hljs-built_in">append</span>(products, Product{
        Name: <span class="hljs-string">"Iphone 14"</span>,
    })
    brands := <span class="hljs-string">""</span>

    <span class="hljs-keyword">for</span> _, product := <span class="hljs-keyword">range</span> products {
        parseProduct(&amp;product, &amp;brands)

        fmt.Println(product.Brand, product.Category)
    }
}
</code></pre>
<p>And here’s the non-pointer version: still valid code, but a tad more annoying since you have to constantly return “product”:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> <span class="hljs-string">"fmt"</span>

<span class="hljs-keyword">type</span> Product <span class="hljs-keyword">struct</span> {
    Name     <span class="hljs-keyword">string</span>
    Brand    <span class="hljs-keyword">string</span>
    Category <span class="hljs-keyword">string</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(p Product)</span> <span class="hljs-title">setBrand</span><span class="hljs-params">(brandName <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">Product</span></span> {
    p.Brand = brandName
    <span class="hljs-keyword">return</span> p
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(p Product)</span> <span class="hljs-title">setCategory</span><span class="hljs-params">(categoryName <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">Product</span></span> {
    p.Category = categoryName
    <span class="hljs-keyword">return</span> p
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseProduct</span><span class="hljs-params">(product Product, brands *<span class="hljs-keyword">string</span>)</span> <span class="hljs-title">Product</span></span> {
    <span class="hljs-comment">// do stuff with product and brands</span>

    product = product.setBrand(<span class="hljs-string">"Some brand"</span>)
    product = product.setCategory(<span class="hljs-string">"Some Category"</span>)

    <span class="hljs-keyword">return</span> product
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    products := []Product{}
    products = <span class="hljs-built_in">append</span>(products, Product{
        Name: <span class="hljs-string">"Iphone 14"</span>,
    })
    brands := <span class="hljs-string">""</span>

    <span class="hljs-keyword">for</span> _, product := <span class="hljs-keyword">range</span> products {
        updatedProduct := parseProduct(product, &amp;brands)

        fmt.Println(updatedProduct.Brand, updatedProduct.Category)
    }
}
</code></pre>
<h2 id="heading-further-learning-resources">Further learning resources</h2>
<p>If you find Golang interesting and want to expand your knowledge further, here are some excellent resources to help you:</p>
<ul>
<li><p>The <a target="_blank" href="https://go.dev/learn/">official Golang</a> docs are an excellent place to start.</p>
</li>
<li><p>Once you’ve mastered the basics, <a target="_blank" href="https://go.dev/doc/effective_go">Effective Go</a> is another essential guide I highly recommend.</p>
</li>
<li><p>If you are more of a visual learner, I recommend looking at <a target="_blank" href="https://www.youtube.com/@anthonygg_">Anthony GG’s YouTube channel</a>. He’s a highly experienced developer and has a wide variety of content for both beginners and experienced devs alike.</p>
</li>
</ul>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Whoa! I have so much to share but this article has gotten way too long already. Please consider subscribing to my newsletter and you'll get notified as soon as the next article drops where we go deeper into Golang and learn more advanced concepts.</p>
<p>Golang is a beautiful, well-designed language, and as you've seen, it's fairly English-like with very little noise, making it an ideal language to grow into if you're coming from a Python background.</p>
<p>Hopefully, you've learned a thing or two from this article and will keep exploring Go further. Happy coding!</p>
]]></content:encoded></item><item><title><![CDATA[🐍 Answer the phone! with Python]]></title><description><![CDATA[Python is a powerful language that can do many things. Most people use it for machine learning or web development, but did you know that it can also interact with hardware and other services like SIP?

💡 Check out my newest article on how to build a...]]></description><link>https://kevincoder.co.za/answer-the-phone-with-python</link><guid isPermaLink="true">https://kevincoder.co.za/answer-the-phone-with-python</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[Python]]></category><category><![CDATA[SIP]]></category><category><![CDATA[voip]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Sun, 07 Jul 2024 14:14:16 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1720361563808/5a5b84c7-f478-43f5-9a56-9348455ba641.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Python is a powerful language that can do many things. Most people use it for machine learning or web development, but did you know that it can also interact with hardware and other services like SIP?</p>
<blockquote>
<p>💡 <strong>Check out my newest article on how to build a</strong> <a target="_blank" href="https://kevincoder.co.za/how-i-used-voice-ai-to-bring-imaginary-characters-to-life"><strong>WebRTC phone in Python here</strong></a><strong>. With WebRTC, you no longer need PyVoip or a SIP provider, accept calls directly from the browser.</strong></p>
</blockquote>
<p>In this article, I am going to walk you through a step-by-step guide on how to build a basic SIP phone and connect that to AI, so that you can have a 2-way conversation with any LLM.</p>
<p><a target="_blank" href="https://learnpython.com/?tap_a=131378-305e0d&amp;ref=ngm5zjv"><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1745423056061/172d7634-71ae-45b1-97ed-c1f552df3aef.jpeg" alt class="image--center mx-auto" /></a></p>
<h2 id="heading-what-is-sip-anyway">What is SIP Anyway?</h2>
<p>Similar to how we have HTTP for the web, voice-over-internet systems usually run on a protocol named SIP, which provides guidelines on how to establish, modify, and terminate sessions over the network.</p>
<p>A SIP session can then carry voice, text, or even video. In the case of our application, SIP is just a signaling protocol and, therefore, is responsible for connecting and disconnecting the call to our Python script.</p>
<p>Once the call is answered and established, we then use the "RTP" or Real-time Transport Protocol to handle the audio stream.</p>
<p>Thankfully with PyVoIP, the library takes care of all the SIP and streaming mechanics, thus we don't have to worry too much about how SIP works or RTP for that matter.</p>
<h2 id="heading-lets-build-something-cool">Let's build something cool!?</h2>
<blockquote>
<p>💡 Are you new to Python? Learn Python with <a target="_blank" href="https://learnpython.com/?tap_a=131378-305e0d&amp;ref=ngm5zjv">LearnPython.com</a> courses. Fun interactive courses that are based on real-life business scenarios, meaning you’ll be writing Python code and seeing the results instantly. <em>No need to install Python or other tools on your device, everything happens through your favorite web browser</em> (Sponsored content).</p>
</blockquote>
<p>In this guide, I will show you how to build a simple phone answering service with Python.</p>
<p>The script will do the following:</p>
<ol>
<li><p>Register as a SIP VOIP phone and wait for calls.</p>
</li>
<li><p>Accept incoming calls.</p>
</li>
<li><p>Transcribe the audio using OpenAI's Whisper.</p>
</li>
</ol>
<h2 id="heading-installing-pip-packages">Installing pip packages</h2>
<p>We are going to need a few PIP packages as follows:</p>
<pre><code class="lang-bash">pip install pyVoIP
pip install pywav
pip install openai
</code></pre>
<p>Be sure to also add your OpenAI key to your environment, in bash you can easily do this by doing the following:</p>
<pre><code class="lang-bash">nano ~/.bashrc

<span class="hljs-comment"># Add to the end of the file</span>
<span class="hljs-built_in">export</span> OPENAI_API_KEY=<span class="hljs-string">"sk-xxx"</span>
</code></pre>
<p>You will need to restart your terminal for this to take effect.</p>
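<p>The OpenAI SDK picks this key up from the environment automatically, but it's worth failing fast if the key is missing before starting the phone. Here's a tiny hypothetical helper (the "check_openai_key" name is mine, not part of the SDK):</p>

```python
import os

def check_openai_key() -> str:
    """Return the OpenAI API key from the environment, or raise if it's missing."""
    api_key = os.environ.get("OPENAI_API_KEY", "")
    if not api_key.startswith("sk-"):
        raise RuntimeError("OPENAI_API_KEY is missing or malformed")
    return api_key
```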
<h2 id="heading-setting-up-a-voip-virtual-phone">Setting up a VOIP virtual phone</h2>
<p>PyVoIP is a nifty little library that can easily help you set up a virtual phone with just a few lines of code.</p>
<blockquote>
<p>ℹ️ You probably want to use something like Twilio instead for a real-world application. PyVoIP audio quality isn't the best and needs quite a bit of modification to work correctly.</p>
</blockquote>
<p>To get started, let's set up a basic phone:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> pyVoIP.VoIP <span class="hljs-keyword">import</span> VoIPPhone, CallState

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">answer</span>(<span class="hljs-params">call</span>):</span>
    <span class="hljs-keyword">try</span>:
        call.answer()

    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        print(e)
    <span class="hljs-keyword">finally</span>:
       call.hangup()

vp = VoIPPhone(
    <span class="hljs-string">'sip domain'</span>, <span class="hljs-number">5060</span>, <span class="hljs-string">'sipuser'</span>, 
    <span class="hljs-string">'sippassword'</span>, callCallback=answer
)
vp.start()
print(vp._status)
input(<span class="hljs-string">"Press any key to exit the VOIP phone session."</span>)
vp.stop()
</code></pre>
<p>In this example, we create a virtual phone using the "VoIPPhone" class. This class takes in a few arguments as follows:</p>
<ul>
<li><p><strong>SIP Credentials</strong>: When you purchase a SIP account from a VOIP provider, you should have received a username, password, and an IP or domain name that will be connected to a phone number. (<a target="_blank" href="http://3Cx.com">3Cx.com</a> is an example of a SIP provider).</p>
</li>
<li><p><strong>callCallback</strong>: This is the function that will handle answering the phone call.</p>
</li>
</ul>
<p>The callback function receives one argument, the "call" object, which contains all the relevant information about the caller and provides various methods to accept the call, and to receive or send audio.</p>
<blockquote>
<p>ℹ️ Did you know that you can build your own VOIP server as well? Asterisk is a powerful open-source VOIP server that you can use to set up your own SIP accounts, phone numbers, extensions, and so forth.</p>
</blockquote>
<h2 id="heading-transcribing-audio">Transcribing audio</h2>
<p>To convert audio into text we can use OpenAI's Whisper service, here's a simple example of how to convert our audio into text:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">import</span> pywav
<span class="hljs-keyword">import</span> uuid

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">convert_to_wav</span>(<span class="hljs-params">audio, tmpFileName</span>):</span>
    data_bytes = <span class="hljs-string">b""</span>.join(audio)
    wave_write = pywav.WavWrite(tmpFileName, <span class="hljs-number">1</span>, <span class="hljs-number">8000</span>, <span class="hljs-number">8</span>, <span class="hljs-number">7</span>)
    wave_write.write(data_bytes)
    wave_write.close()

    <span class="hljs-keyword">return</span> open(tmpFileName, <span class="hljs-string">"rb"</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">transcribe_to_text</span>(<span class="hljs-params">audio_file</span>):</span>
    tmpFileName = <span class="hljs-string">f"/tmp/audio/_audio_buffer_<span class="hljs-subst">{uuid.uuid4()}</span>.wav"</span>
    client = OpenAI()

    <span class="hljs-keyword">try</span>:
        transcription = client.audio.transcriptions.create(
            model=<span class="hljs-string">"whisper-1"</span>,
            file=convert_to_wav(audio_file, tmpFileName)
        )
        <span class="hljs-keyword">return</span> transcription.text
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> ex:
        print(ex)

    <span class="hljs-keyword">return</span> <span class="hljs-string">""</span>
</code></pre>
<p>The "transcribe_to_text" function takes in a list of raw audio byte samples. We then need to convert those samples into an actual audio file, because the OpenAI SDK expects a file object, not raw audio bytes.</p>
<p>We therefore use "pywav" in our "convert_to_wav" function to convert the raw audio bytes into a ".wav" audio file.</p>
<blockquote>
<p>⚠️ This logic is simplified so that it's easier to understand, but essentially it can be optimized to remove the need for saving to a temp file since disk IO on a slow drive might cause issues.</p>
</blockquote>
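<p>As a sketch of that optimization: since the audio is 8kHz, 8-bit µ-law (the format code 7 passed to pywav), the WAV container can be assembled in memory with the standard library instead of writing a temp file. This is my own illustrative helper, not pyVoIP or OpenAI API surface, and strict WAV decoders may additionally want a "fact" chunk for non-PCM formats:</p>

```python
import io
import struct

def ulaw_wav_buffer(raw: bytes, sample_rate: int = 8000) -> io.BytesIO:
    """Wrap raw 8-bit mu-law samples in a minimal in-memory WAV container."""
    channels, bits, fmt_code = 1, 8, 7  # mono, 8-bit, WAVE_FORMAT_MULAW
    byte_rate = sample_rate * channels * bits // 8
    block_align = channels * bits // 8

    buf = io.BytesIO()
    buf.write(b"RIFF")
    buf.write(struct.pack("<I", 36 + len(raw)))  # total file size minus 8 bytes
    buf.write(b"WAVEfmt ")
    buf.write(struct.pack("<IHHIIHH", 16, fmt_code, channels,
                          sample_rate, byte_rate, block_align, bits))
    buf.write(b"data")
    buf.write(struct.pack("<I", len(raw)))
    buf.write(raw)
    buf.seek(0)
    buf.name = "audio.wav"  # a name lets file-upload SDKs infer the format
    return buf
```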
<h2 id="heading-updating-our-answer-method-to-chunk-the-audio">Updating our answer method to chunk the audio</h2>
<p>In our "answer" method, we receive the audio as a continuous stream of bytes, and each call to "read_audio" returns a 20ms chunk. We cannot send a single 20ms chunk to Whisper because the API's minimum audio length is 100ms.</p>
<p>Thus, we need to append the audio to a buffer and we'll only send the audio to Whisper once we reach 1000ms (or 1 second).</p>
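<p>To sanity-check the maths before writing the buffering code: at 8000 Hz with 8-bit (1 byte) samples, every millisecond of audio is 8 bytes, so a 160-byte chunk (the size assumed here, consistent with the 20ms figure above) is 20ms, and 50 chunks make up one second:</p>

```python
SAMPLE_RATE = 8000    # samples per second
BYTES_PER_SAMPLE = 1  # 8-bit audio: one byte per sample
CHUNK_BYTES = 160     # assumed chunk size, consistent with 20ms chunks

bytes_per_ms = SAMPLE_RATE * BYTES_PER_SAMPLE / 1000  # 8 bytes per millisecond
chunk_ms = CHUNK_BYTES / bytes_per_ms                 # 20.0 ms per chunk
chunks_per_second = 1000 / chunk_ms                   # 50 chunks buffer 1 second

print(chunk_ms, chunks_per_second)
```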
<p>Here is the updated "answer" function:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">answer</span>(<span class="hljs-params">call</span>):</span>
    <span class="hljs-keyword">try</span>:
        call.answer()
        buffer = []
        buff_length = <span class="hljs-number">0</span>

        <span class="hljs-keyword">while</span> call.state == CallState.ANSWERED:
            audio = call.read_audio()
            <span class="hljs-comment"># 8000 samples/sec at 8 bits (1 byte) per sample = 8 bytes per ms,</span>
            <span class="hljs-comment"># so dividing the byte count by 8 gives the chunk length in milliseconds.</span>
            buff_length += len(audio) / <span class="hljs-number">8</span> <span class="hljs-comment"># each chunk works out to 20ms</span>

            buffer.append(audio)

            <span class="hljs-keyword">if</span> buff_length &gt;= <span class="hljs-number">1000</span>: <span class="hljs-comment"># ~1 second buffered, including this chunk</span>
                print(transcribe_to_text(buffer))
                buffer = []
                buff_length = <span class="hljs-number">0</span>

    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        print(e)
    <span class="hljs-keyword">finally</span>:
       call.hangup()
</code></pre>
<blockquote>
<p>💡 You can also send back audio to the caller by calling "call.write_audio(raw_audio_bytes_here)"</p>
</blockquote>
<h2 id="heading-the-full-code">The full code</h2>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> pyVoIP.VoIP <span class="hljs-keyword">import</span> VoIPPhone, CallState
<span class="hljs-keyword">import</span> uuid
<span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> pywav

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">convert_to_wav</span>(<span class="hljs-params">audio, tmpFileName</span>):</span>
    data_bytes = <span class="hljs-string">b""</span>.join(audio)
    wave_write = pywav.WavWrite(tmpFileName, <span class="hljs-number">1</span>, <span class="hljs-number">8000</span>, <span class="hljs-number">8</span>, <span class="hljs-number">7</span>)
    wave_write.write(data_bytes)
    wave_write.close()

    <span class="hljs-keyword">return</span> open(tmpFileName, <span class="hljs-string">"rb"</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">transcribe_to_text</span>(<span class="hljs-params">audio_file</span>):</span>
    tmpFileName = <span class="hljs-string">f"/tmp/audio/_audio_buffer_<span class="hljs-subst">{uuid.uuid4()}</span>.wav"</span>
    client = OpenAI()

    <span class="hljs-keyword">try</span>:
        transcription = client.audio.transcriptions.create(
            model=<span class="hljs-string">"whisper-1"</span>,
            file=convert_to_wav(audio_file, tmpFileName)
        )
        <span class="hljs-keyword">return</span> transcription.text
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> ex:
        print(ex)
    <span class="hljs-keyword">return</span> <span class="hljs-string">""</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">answer</span>(<span class="hljs-params">call</span>):</span>
    <span class="hljs-keyword">try</span>:
        call.answer()
        buffer = []
        buff_length = <span class="hljs-number">0</span>

        <span class="hljs-keyword">while</span> call.state == CallState.ANSWERED:
            audio = call.read_audio()
            buff_length += len(audio) / <span class="hljs-number">8</span>

            buffer.append(audio)

            <span class="hljs-keyword">if</span> buff_length &gt;= <span class="hljs-number">1000</span>: <span class="hljs-comment"># ~1 second buffered, including this chunk</span>
                print(transcribe_to_text(buffer))
                buffer = []
                buff_length = <span class="hljs-number">0</span>

    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        print(e)
    <span class="hljs-keyword">finally</span>:
       call.hangup()

vp = VoIPPhone(<span class="hljs-string">'xxx'</span>, <span class="hljs-number">5060</span>, <span class="hljs-string">'xxx'</span>, <span class="hljs-string">'xxx'</span>, callCallback=answer)
vp.start()
print(vp._status)
input(<span class="hljs-string">"Press any key to exit the VOIP phone session."</span>)
vp.stop()
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>There you have it! A simple phone answering system that can stream live audio and transcribe that audio into text.</p>
<p>As mentioned earlier, PyVoIP is not the best tool for the job since it doesn't handle background noise and static very well. For an actual real-world application, you would need to write some logic to strip out the bad audio samples before transcribing, but hopefully, this is a good start.</p>
]]></content:encoded></item><item><title><![CDATA[Machine learning for web developers]]></title><description><![CDATA[As someone who works 90% of the time on web-related projects; data science, and machine learning were not my core competencies or areas of interest for most of my career.
In the past 2 years, machine learning has become a vital component of my toolbo...]]></description><link>https://kevincoder.co.za/machine-learning-for-web-developers</link><guid isPermaLink="true">https://kevincoder.co.za/machine-learning-for-web-developers</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[Web Development]]></category><category><![CDATA[webdev]]></category><category><![CDATA[huggingface]]></category><category><![CDATA[vector embeddings]]></category><category><![CDATA[Retrieval-Augmented Generation]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Tue, 07 May 2024 06:42:38 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1715064358055/46ea783a-8bab-440f-a3f5-877ea136c4b1.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As someone who works 90% of the time on web-related projects; data science, and machine learning were not my core competencies or areas of interest for most of my career.</p>
<p>In the past 2 years, machine learning has become a vital component of my toolbox as a developer.</p>
<p>Initially, the landscape was wild! We dealt with Keras, PyTorch, TensorFlow, and many other tools. Luckily, since the advent of ChatGPT and Sentence Transformers, it is now easier than ever to build AI-related products.</p>
<p>If you need to sprinkle a little AI magic into your project, you can probably get away with simple REST calls to OpenAI endpoints. However, if you are building a more complex product and have little to no experience with deeper machine learning concepts such as RAG, LangChain, and so on, then this article will be an ideal primer for you.</p>
<blockquote>
<p>🤦🏾 We actually abuse the term "AI". Machine learning is just one subset of AI. Nonetheless, to keep things simple, since everyone is on the "AI" bandwagon these days, I will use these terms interchangeably. Sorry!</p>
</blockquote>
<h2 id="heading-models-give-up-control">Models: Give up control</h2>
<p>There are few, if any, "IF", "THEN", "ELSE", or "DO WHILE" controls in machine learning; it is a different paradigm compared to conventional programming.</p>
<p>You will often use natural language and configuration-style programming to tune models in the direction you want them to go. Ultimately, though, you never have 100% control over what the model will do. In most cases you can predict the outcome with great certainty, but it's not the same as conventional programming, where we write rules and control flows to precisely guide the user along the path we want them to take.</p>
<p>A model is largely a "black box" of neural networks. You can think of a neural network as a chain of tiny nodes (neurons). Together all the nodes in a neural network build up some pattern recognition ability that the model uses to analyze users' input and generate the appropriate response.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714489319353/a7cdf1ee-fdfc-439b-a7be-3b9484436e02.jpeg" alt class="image--center mx-auto" /></p>
<p>A neural network usually contains 3 types of layers:</p>
<ol>
<li><p><strong>The input layer:</strong> takes in the data as a whole. Neurons in this layer represent various features of the input data. For an image of a cat, for example, some neurons will represent the fur, others the tail, others the whiskers, and so forth.</p>
</li>
<li><p><strong>Hidden layers:</strong> features extracted from the input layer are analyzed further by these layers, which adjust weights and biases. Essentially, mathematical operations are applied to the data to find the patterns that give the model its reasoning and prediction abilities.</p>
</li>
<li><p><strong>Output layer:</strong> this layer uses the "learnings" from the previous layers to build an appropriate response for the user. In the cat image example, we probably want to classify the type of animal, so this layer will return "Cat".</p>
</li>
</ol>
<p>When you train a model, you pass the same data through it over several iterations. One full pass of the data through all layers is known as an "epoch". With each epoch, the model adjusts and optimizes its pattern recognition ability (too many epochs can be bad; read up on "model loss functions" and overfitting).</p>
<p>Training over multiple epochs is vital for improving the model's accuracy. For example, if the first image is of a "sphynx", the neural network is not yet aware of other cat species and will therefore make various assumptions about cats based on this one species.</p>
<p>This causes weird predictions, since a sphynx is significantly different from a regular house cat. By the second epoch, the network has seen the other 99 cat species in the dataset and thus has a holistic view of the different species and their features. Each subsequent epoch then re-analyzes the data and looks for ways to optimize the model's accuracy.</p>
<p>Usually when training a model, you also provide a small sample dataset with known-accurate examples, which is used to validate and fine-tune the model's accuracy during the training phase.</p>
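<p>To make these layers concrete, here is a toy forward pass sketched in NumPy. Everything below is hypothetical: the weights are random, so this network "recognizes" nothing yet; training over many epochs is what would tune the weights toward real patterns:</p>

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "network": 4 input features -> 3 hidden neurons -> 2 output classes.
# The weights are random purely for illustration; training over many
# epochs is what would adjust them to recognize real patterns.
W_hidden = rng.normal(size=(4, 3))
W_output = rng.normal(size=(3, 2))

def relu(x):
    # Simple non-linearity used between layers.
    return np.maximum(0, x)

def softmax(x):
    # Turns raw scores into probabilities that sum to 1.
    e = np.exp(x - x.max())
    return e / e.sum()

features = np.array([0.9, 0.1, 0.4, 0.7])  # input layer: extracted features
hidden = relu(features @ W_hidden)          # hidden layer: weighted patterns
probs = softmax(hidden @ W_output)          # output layer: e.g. "cat" vs "dog"

print(probs)  # two probabilities that sum to 1
```

<p>A real model has many more layers and millions (or billions) of parameters, but the data flow is the same: features in, weighted transformations through hidden layers, probabilities out.</p>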
<h2 id="heading-tooling-and-frameworks-to-build-models">Tooling and frameworks to build models</h2>
<p>In the Python world, PyTorch is the undisputed leader when it comes to building machine learning models. It's open-source and freely available, so just about anyone can implement their own models from scratch, or extend someone else's model.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714489694951/fa9eb92f-cee4-4fa2-8fe4-fcb731eeaecc.jpeg" alt class="image--center mx-auto" /></p>
<p>Before you even build a model, you are going to need some kind of dataset. Usually a CSV or JSON file. You can build your own dataset from scratch using your own data, scrape data from somewhere, or use <a target="_blank" href="https://kaggle.com/">Kaggle</a>.</p>
<blockquote>
<p>💡 Kaggle is a community of "machine learners" where you can find various kinds of freely available datasets and even models to use for both commercial and non-commercial purposes.</p>
</blockquote>
<p>Dealing with large datasets in regular Python lists or dictionaries can become inefficient, thus in addition to Pytorch, you will need to learn 2 libraries that assist with parsing and manipulating data (and Langchain for LLM-specific tasks):</p>
<ul>
<li><p><strong>Pandas:</strong> This will allow you to load and parse your CSV/JSON training data and format it efficiently into whatever format the PyTorch model needs the data in. Furthermore, Pandas has a consistent API, thus you can easily switch between data sources without needing major refactoring in your scripts.</p>
</li>
<li><p><strong>Numpy:</strong> Models usually work with vector embeddings of your text (numerical representations), which are arrays of floating point numbers. NumPy is used throughout the process to perform various numerical computations, and PyTorch tensors convert easily to and from NumPy arrays, so you will often need to process or convert results accordingly.</p>
</li>
<li><p><strong>Langchain:</strong> A powerful LLM utility library to make working with LLM APIs and Models much easier. You most certainly can use the REST API or Python library for your LLM and work with it just fine without Langchain. For larger applications, Langchain simplifies common tasks, like building system prompts, agents, RAG retrievers, and so on.</p>
</li>
</ul>
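<p>A quick sketch of how Pandas and NumPy typically hand off to each other. The column names and values below are made up purely for illustration; in practice you would load a real file with <code>pd.read_csv("training_data.csv")</code>:</p>

```python
import io

import numpy as np
import pandas as pd

# Hypothetical training data, inlined here so the example is self-contained.
csv_data = io.StringIO(
    "fur_length,ear_size,label\n"
    "0.9,0.2,house_cat\n"
    "0.05,0.7,sphynx\n"
)

# Pandas parses and cleans the raw file...
df = pd.read_csv(csv_data)

# ...and NumPy hands the raw numbers to the model.
features = df[["fur_length", "ear_size"]].to_numpy()

print(features.shape)  # (2, 2)
```

<p>From here, the NumPy array can be converted to a tensor for whichever framework you train with.</p>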
<h2 id="heading-a-wide-variety-of-pre-trained-models">A wide variety of pre-trained models</h2>
<p>You may be aware of Llama3 and OpenAI models, but in the machine learning world, there are hundreds if not thousands of models available for all kinds of machine learning tasks. Even OpenAI has several different models: GPT-3.5 (gpt-3.5-turbo...), which powers the free version of ChatGPT; GPT-4, which powers ChatGPT Plus; Whisper (audio to text); DALL·E (image generation); and so forth.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714489806496/52d368fb-ebc9-4304-b87f-811c0eecfdd8.jpeg" alt class="image--center mx-auto" /></p>
<p>Some of the most popular model types are:</p>
<ul>
<li><p>Text embedding: Converts text into vector embeddings. There is a leaderboard to rank models in this class: <a target="_blank" href="https://huggingface.co/spaces/mteb/leaderboard">https://huggingface.co/spaces/mteb/leaderboard</a></p>
</li>
<li><p>Image generation: DALL·E 3, Midjourney, Stable Diffusion, and so forth.</p>
</li>
<li><p>Text classification: Models that can categorize various pieces of data, like spam classification, product categorization, tagging, etc... Some model examples: FastText, BERT, facebook/bart-large-mnli</p>
</li>
<li><p>Image classification: Given an image, these models can determine an appropriate label, tag, or caption. Examples: microsoft/resnet-50, OpenAI CLIP.</p>
</li>
<li><p>Object detection: Given an image, detect various entities in the image. Examples: facebook/detr-resnet-50, keremberke/yolov5m-garbage</p>
</li>
<li><p>Text to Speech (and vice-versa): microsoft/speecht5_tts, OpenAI Whisper, OpenAI TTS.</p>
</li>
<li><p>Large language models: General purpose models that can do some or all of the tasks above. Examples: GPT-3.5, Mixtral, Llama3, Gemini.</p>
</li>
</ul>
<p>Similar to "Kaggle" I mentioned earlier, when it comes to models, one of the best places to find pre-trained models is <a target="_blank" href="https://huggingface.co/">HuggingFace</a>.</p>
<p>In addition to hosting models, the Hugging Face ecosystem includes the open-source "sentence-transformers" library, which lets you use many of these models in your code in a consistent way.</p>
<p>Here's a simple example of how you can generate vector embeddings using the Huggingface "sentence-transformers" library:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> sentence_transformers <span class="hljs-keyword">import</span> SentenceTransformer
sentences = [<span class="hljs-string">"Some sentence or phrase here"</span>]

<span class="hljs-comment"># Notice we pass in the model name:</span>
<span class="hljs-comment"># 'sentence-transformers/all-mpnet-base-v2'</span>
<span class="hljs-comment"># You can easily swap this model out for any other</span>
<span class="hljs-comment"># - compatible model e.g. "sentence-transformers/all-MiniLM-L6-v2"</span>
model = SentenceTransformer(<span class="hljs-string">'sentence-transformers/all-mpnet-base-v2'</span>)

<span class="hljs-comment"># Will convert the sentences into vector embeddings.</span>
embeddings = model.encode(sentences)

<span class="hljs-comment"># Will print a Numpy array</span>
print(embeddings)
</code></pre>
<h2 id="heading-what-are-vector-embeddings">What are vector embeddings?</h2>
<p>I have mentioned this concept a few times, so just to clarify for those who are unfamiliar.</p>
<p>Machine learning, under the hood, is essentially statistics at play: a ton of mathematical formulas are used to generate predictions and perform other machine learning tasks.</p>
<p>Now as you can imagine, text and letters are not mathematical in nature. You can't perform calculations on them; therefore, for the math algorithms to work efficiently, we need to convert words and letters into numbers.</p>
<p>Each word or phrase is converted into an array of floating point numbers (a vector), and larger units of text (e.g. a sentence or paragraph) can be pooled into a single vector. These numerical representations are known as vector embeddings.</p>
<p>Vector embeddings capture contextual information based on how close words are to each other, how frequently they appear in similar sentences, and so on. This stores enough information for algorithms like cosine similarity, ANN, and KNN to calculate the semantic meaning of the text.</p>
<p>As you can imagine, one of the most useful use cases for vector embeddings is semantic search. With semantic search, the algorithm can match sentences and documents even if the search term does not literally appear in the results (unlike SOLR or Elasticsearch, which mostly rely on keywords, synonyms, or fuzzy-spelled matches).</p>
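<p>Cosine similarity itself is only a few lines of NumPy. The 3-dimensional "embeddings" below are made up for illustration; real models emit hundreds of dimensions:</p>

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a · b) / (|a| * |b|); 1.0 means identical direction.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Tiny made-up "embeddings"; a real model would produce these vectors.
docs = {
    "gaming laptop": np.array([0.9, 0.8, 0.1]),
    "fluffy cat":    np.array([0.1, 0.0, 0.95]),
}
# Pretend this is the embedding of "high-end gaming machine".
query = np.array([0.85, 0.7, 0.05])

scores = {name: cosine_similarity(query, vec) for name, vec in docs.items()}
best = max(scores, key=scores.get)
print(best)  # gaming laptop
```

<p>Notice that "gaming laptop" wins even though the query shares no keywords with it; the closeness lives in the vectors, not the spelling.</p>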
<p>You can generate vector embedding as follows:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> sentence_transformers <span class="hljs-keyword">import</span> SentenceTransformer
model = SentenceTransformer(<span class="hljs-string">'sentence-transformers/all-mpnet-base-v2'</span>)
embeddings = model.encode([<span class="hljs-string">"Machine learning"</span>])
print(embeddings)
</code></pre>
<p>The result will be a Numpy array (the actual vector embedding is much larger; I have shortened it below just to give you an example):</p>
<pre><code class="lang-python">[[<span class="hljs-number">-2.14947201e-02</span>  <span class="hljs-number">6.24050908e-02</span> <span class="hljs-number">-6.70606121e-02</span>  <span class="hljs-number">1.17204664e-02</span>
  <span class="hljs-number">-2.23315824e-02</span>  <span class="hljs-number">3.22391130e-02</span> <span class="hljs-number">-1.10149011e-02</span>  <span class="hljs-number">2.08304841e-02</span>
  <span class="hljs-number">-9.21188202e-03</span>  <span class="hljs-number">3.96048091e-02</span>  <span class="hljs-number">1.33278236e-01</span>  <span class="hljs-number">4.91726398e-02</span>
  <span class="hljs-number">-2.97797378e-02</span>  <span class="hljs-number">6.26323521e-02</span>  <span class="hljs-number">2.64899582e-02</span> <span class="hljs-number">-8.43470395e-02</span>
   <span class="hljs-number">8.12786724e-03</span>  <span class="hljs-number">7.97691289e-03</span> <span class="hljs-number">-5.54570891e-02</span>  <span class="hljs-number">6.59016613e-03</span>
   <span class="hljs-number">1.62357709e-03</span>  <span class="hljs-number">7.52743008e-03</span>  <span class="hljs-number">2.48725782e-03</span>  <span class="hljs-number">4.35999036e-03</span>]]
</code></pre>
<p>Depending on the model you use, the dimension (the number of floats) will differ. In this case, the model generates 768 floating point numbers per embedding. It's not always true, but for the most part, the larger the dimension, the better the accuracy.</p>
<h2 id="heading-what-is-retrieval-augmented-generation-rag">What is Retrieval-Augmented Generation (RAG)?</h2>
<p>This will be one of the most common machine learning systems you will need to build for web applications.</p>
<p>When you ask ChatGPT a question, it refers back to its vast training dataset and generates a response based on that context. Sometimes it can hallucinate and give you incorrect information.</p>
<p>Other times, it may give you the correct information, but in your unique context that information may be irrelevant or too broad to be useful. For example, ask "What is the price of a MacBook Pro?" and the model responds with "From $1,299...".</p>
<p>This may be correct, but it's not very useful because it doesn't give me the price in my local currency, or provide more detail about each model.</p>
<p>RAG systems aim to fix this problem. Essentially, you give the model a custom dataset and it scopes its responses to only that dataset, allowing for better accuracy and better local context for your domain/business.</p>
<p>This is different from fine-tuning. With fine-tuning, you take a pre-trained model and extend its ability by providing a custom dataset for your use case. You then re-run training, and the model generates a new trained state; its knowledge is frozen at that state and is not real-time.</p>
<p>With a RAG system, you are not re-training the model; you provide context data in real-time, so you do not need to retrain the model every time the data changes.</p>
<p>The LLM uses its original training data as its basis to develop reasoning, but any generation or prediction tasks it performs using RAG will be scoped to your custom data and precedence will be given to your data instead of its original dataset.</p>
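<p>The flow can be sketched in plain Python: embed the query, retrieve the closest documents from your dataset, and paste them into the prompt as context. Everything below (the documents, the prices, the vectors) is made up for illustration; in a real system the vectors would come from an embedding model and live in a vector store:</p>

```python
import numpy as np

# A hypothetical, pre-embedded knowledge base.
knowledge_base = [
    ("The MacBook Pro starts at R39,999 in our store.", np.array([0.9, 0.1, 0.2])),
    ("We deliver nationwide within 3-5 working days.",  np.array([0.1, 0.9, 0.1])),
]

def retrieve(query_vec, top_k=1):
    # Rank documents by cosine similarity to the query embedding.
    def score(vec):
        return np.dot(query_vec, vec) / (
            np.linalg.norm(query_vec) * np.linalg.norm(vec)
        )
    ranked = sorted(knowledge_base, key=lambda d: score(d[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

# Pretend this is the embedding of "What is the price of a MacBook Pro?"
query_vec = np.array([0.95, 0.05, 0.15])

context = retrieve(query_vec)
prompt = (
    "Answer ONLY from the context below. If the answer is not there, "
    "say 'I don't know'.\n\nContext:\n" + "\n".join(context)
)
print(prompt)
```

<p>The assembled prompt, plus the user's question, is then what you actually send to the LLM.</p>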
<p>If you want to learn how to build a RAG system, you can have a look at an earlier article I have done <a target="_blank" href="https://kevincoder.co.za/how-to-build-a-pdf-chatbot-with-langchain-and-faiss">here</a>.</p>
<h2 id="heading-how-to-host-your-own-llm">How to host your own LLM?</h2>
<p>When you first get into machine learning, I would advise you not to tinker with open-source models like Llama3, but rather to focus on using OpenAI models first and get familiar with their APIs.</p>
<p>OpenAI credits are fairly cheap and can get you good mileage; sometimes, though, when you are building a large-scale application, you may want to run your own in-house model.</p>
<p>You have a few open-source options:</p>
<ul>
<li><p><strong>Llama3</strong> - Developed by Facebook.</p>
</li>
<li><p><strong>Mixtral</strong> - Developed by Mistral AI, an independent company that's making waves in the ML world.</p>
</li>
<li><p><strong>Phi3</strong> - Developed by Microsoft, very compact and efficient. This model can even run on mobile devices.</p>
</li>
</ul>
<p>I wouldn't touch Google's Gemma model. I tried it initially and the results were poor; to be fair, though, that was within the first week of its launch.</p>
<p>Depending on your GPU resources, you may want to try Phi3 first, then Mixtral, and finally Llama3. Llama3 should be the best-performing model, but it is a tad more resource-intensive, and you may not always need that kind of power.</p>
<p>To run your models, you can use <a target="_blank" href="https://ollama.com/">ollama</a>. I also did a tutorial on setting this up <a target="_blank" href="https://kevincoder.co.za/how-to-host-your-own-chatgpt-like-model">here</a>.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Hopefully, I have given you enough insight to start expanding your skillset into machine learning. Of course, machine learning has gone through many changes in recent years, and libraries and models are still constantly evolving.</p>
<p>I would advise not getting caught up in all the hype; rather, pick up Pandas and NumPy and play around with Huggingface models first.</p>
<p>Once you have a solid grasp of these, then move on to Langchain and integrate with OpenAI to build a simple RAG, followed by a simple agent.</p>
<p>Happy building!</p>
]]></content:encoded></item><item><title><![CDATA[Build a production-grade RAG or similarity search using Qdrant and Langchain]]></title><description><![CDATA[In a recent article, I wrote about using Langchain and FAISS to build a RAG system. FAISS, however, is in memory and not scalable when you want to update the training data constantly in real-time.
Qdrant is a great alternative to FAISS. In this artic...]]></description><link>https://kevincoder.co.za/build-a-production-grade-rag-or-similarity-search-using-qdrant-and-langchain</link><guid isPermaLink="true">https://kevincoder.co.za/build-a-production-grade-rag-or-similarity-search-using-qdrant-and-langchain</guid><category><![CDATA[qdrant]]></category><category><![CDATA[Machine Learning]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[openai]]></category><category><![CDATA[langchain]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Wed, 17 Apr 2024 06:43:04 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1713334927249/d8d3bf44-d0f5-4ff0-872b-b39a03de72cf.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In a recent <a target="_blank" href="https://kevincoder.co.za/how-to-build-a-pdf-chatbot-with-langchain-and-faiss">article</a>, I wrote about using Langchain and FAISS to build a RAG system. FAISS, however, is in memory and not scalable when you want to update the training data constantly in real-time.</p>
<p>Qdrant is a great alternative to FAISS. In this article, we will look at building a basic similarity search using Langchain and Qdrant.</p>
<h2 id="heading-what-is-qdrant">What is Qdrant?</h2>
<p>In the world of machine learning these days, it's common practice to vectorize text and store it in a backend DB. Unfortunately, most traditional databases such as MySQL are not designed to store vector embeddings. Things move fast in the machine learning world and support is getting better, but vectors are not yet a first-class citizen in RDBMS-type databases.</p>
<p>Qdrant is a modern database optimized for storing unstructured data, especially vector embeddings. Besides being a vector store, you can also use Qdrant as a search engine; it lacks features such as faceting but is still pretty useful for long-tail, phrase-type searches.</p>
<p>Take for example: "Please suggest a high-end gaming machine with lots of RAM and a good graphics card ", a regular keyword search might return no results or poor results.</p>
<p>I tried this on Amazon.com and got weird results for: RAM, mini PCs, and socks 😂:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1711451907839/a50ad374-7898-48d7-9f42-3ad474d769ea.png" alt class="image--center mx-auto" /></p>
<p>Naturally, this confuses a keyword-based search engine because it's looking for specific keywords or their synonyms. On Qdrant, you can most likely find "gaming laptops" for the same search, since a vector-based search also stores "meaning" and uses algorithms such as kNN with cosine similarity to better understand the searcher's intent.</p>
<blockquote>
<p>If you don't know what vector embeddings are: they are basically a numerical representation of text that machine learning models can use to perform math calculations, in order to determine meaning and relationships between words and phrases. Qdrant's website provides a more in-depth article on this subject as well; you can learn more <a target="_blank" href="https://qdrant.tech/articles/what-are-embeddings/">here</a>.</p>
</blockquote>
<h2 id="heading-setting-up-qdrant">Setting up Qdrant</h2>
<p>Similar to most NoSQL databases, Qdrant uses the concept of "collections". Essentially a collection is a unique grouping of documents, similar to an SQL table.</p>
<p>The easiest way to get started with Qdrant, is to use the docker image:</p>
<pre><code class="lang-bash">docker run -dit -p 6333:6333 qdrant/qdrant
</code></pre>
<p>You should now be able to view the web dashboard by visiting the following link: http://127.0.0.1:6333/dashboard</p>
<p>Before we get to the code, you will need to install a few pip packages:</p>
<pre><code class="lang-bash">pip install langchain
pip install qdrant_client
pip install langchain-community
pip install langchain-openai
</code></pre>
<p>Setting up our schema:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> qdrant_client <span class="hljs-keyword">import</span> QdrantClient
<span class="hljs-keyword">from</span> qdrant_client.models <span class="hljs-keyword">import</span> Distance, VectorParams
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> OpenAIEmbeddings
<span class="hljs-keyword">from</span> langchain_community.vectorstores <span class="hljs-keyword">import</span> Qdrant

embeddings = OpenAIEmbeddings()
client = QdrantClient(<span class="hljs-string">'http://127.0.0.1:6333'</span>)

<span class="hljs-comment"># Create the collection first; should only run this once.</span>
<span class="hljs-comment"># size=1536 matches the dimension of OpenAI's text embeddings.</span>
client.create_collection(
    collection_name=<span class="hljs-string">"products"</span>,
    vectors_config=VectorParams(size=<span class="hljs-number">1536</span>, distance=Distance.COSINE),
)

langchain_qdrant_store = Qdrant(
    embeddings=embeddings,
    client=client,
    collection_name=<span class="hljs-string">"products"</span>
)
</code></pre>
<p>When you run the above, it should create a collection named "products" with a single vector field that will house our embeddings. For the purposes of this article, and to keep things simple, I am storing only the vector embeddings and no other meta information.</p>
<p>You can most certainly, store more information using "payloads" if needed. Learn more about payloads <a target="_blank" href="https://qdrant.tech/documentation/concepts/payload/">here</a>.</p>
<h2 id="heading-storing-documents-in-the-collection">Storing documents in the collection</h2>
<p>Storing documents in the collection is fairly straightforward when using the "OpenAIEmbeddings" and Langchain Documents. Langchain will automatically take care of the complexity of querying OpenAI and generating the vector embeddings.</p>
<p>So basically, all you need to do in order to index a document in Qdrant is the following:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain.docstore.document <span class="hljs-keyword">import</span> Document

doc = Document(page_content=<span class="hljs-string">"Some text data here"</span>)
langchain_qdrant_store.add_documents([doc])
</code></pre>
<blockquote>
<p>If you want to store documents like PDF’s, here’s an open-source project I built some time ago that provides an example: <a target="_blank" href="https://github.com/kevincoder-co-za/ragable/tree/main">kevincoder-co-za/ragable</a></p>
</blockquote>
<h2 id="heading-finding-similar-documents">Finding similar documents</h2>
<p>To perform a KNN search we simply need to do the following:</p>
<pre><code class="lang-python">documents = langchain_qdrant_store.similarity_search(<span class="hljs-string">"Some search term"</span>)
</code></pre>
<p>The above will return a list of "Langchain documents", which you can pass on to an LLM or process further just like you would any other Python object in your application.</p>
<h2 id="heading-using-the-qdrant-store-as-a-retriever">Using the Qdrant store as a retriever</h2>
<p>Langchain is awesome and can simplify the process of passing this information on to an LLM for further reasoning; see the example below:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain.chains <span class="hljs-keyword">import</span> RetrievalQA
<span class="hljs-keyword">from</span> langchain.chat_models <span class="hljs-keyword">import</span> ChatOpenAI

llm = ChatOpenAI(model_name=<span class="hljs-string">"gpt-3.5-turbo-0125"</span>, temperature=<span class="hljs-number">0</span>)
chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type=<span class="hljs-string">"stuff"</span>,
    retriever=langchain_qdrant_store.as_retriever()
)
question = <span class="hljs-string">"Some question about your documents here"</span>
print(chain({<span class="hljs-string">"query"</span>: question}))
</code></pre>
<p>Whenever you pass a question to the LLM, it will automatically query Qdrant for similar documents and then determine its response accordingly based on that context data.</p>
<p>This is quite powerful and is known as "RAG" (retrieval-augmented generation). Remember, with ChatGPT and other such models, it's often possible for the LLM to hallucinate or provide a response that may be correct but not as in-depth as you may like.</p>
<p>With RAG, you are providing the context data and can scope the LLM to only respond based on the data you provide, this reduces the model's likelihood to hallucinate, and you can enforce an "I don't know" rule or something to that effect when it cannot answer the question effectively.</p>
<h2 id="heading-what-can-i-build-with-this-kind-of-system">What can I build with this kind of system?</h2>
<p>Some common use cases include:</p>
<ul>
<li><p>A summarization tool. You can vectorize hundreds or thousands of documents into Qdrant and then have the LLM summarize and answer questions based on that data. This is useful in various business applications, such as training, onboarding, customer support, and so forth.</p>
</li>
<li><p>Insights tool. Ingest large amounts of statistics and then build an interface with various charts to predict trends and understand your data better. Crime intelligence is a good example.</p>
</li>
<li><p>Suggestion engine. A great use case would be e-commerce, the chatbot can help users with their purchases by providing more information and suggesting other products that may go well together and so forth.</p>
</li>
<li><p>Similarity search. You don't need a full RAG for this, since it'll be slow and use lots of resources if you have a high volume of searches. Qdrant alone is sufficient, using the "similarity_search" function.</p>
</li>
</ul>
<p>These are just a handful of suggestions, I am sure there are loads more interesting applications that can be built using this platform.</p>
<h2 id="heading-what-are-the-disadvantages-of-using-qdrant">What are the disadvantages of using Qdrant?</h2>
<p>One of the major pain points with Qdrant is how the querying engine works, it’s similar to NoSQL but very basic and cumbersome to work with if you need to do some advanced filtering like you would in SQL:</p>
<pre><code class="lang-sql">where city="Cape Town" and price &gt;= 500
</code></pre>
<p>In Qdrant:</p>
<pre><code class="lang-json">POST /collections/{collection_name}/points/search

{
  "filter": {
    "must": [
      {
        "key": "city",
        "match": {
          "value": "Cape Town"
        }
      },
      {
        "key": "price",
        "range": {
          "gte": 500
        }
      }
    ]
  }
}
</code></pre>
<p>As you can see, it's much more verbose and cumbersome to work with than plain old SQL. In practice, though, this is usually not a problem, because 90% of the time you just use Qdrant for the similarity search, get back a list of IDs (or something else to identify your data), and then use those to query your SQL backend.</p>
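<p>That hybrid pattern looks roughly like this, sketched with SQLite and hard-coded IDs standing in for the results of a Qdrant similarity search (the table and data are made up for illustration):</p>

```python
import sqlite3

# A toy products table standing in for your real SQL backend.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, city TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        (1, "Gaming laptop", "Cape Town", 1500.0),
        (2, "Mini PC", "Cape Town", 450.0),
        (3, "Gaming desktop", "Durban", 2000.0),
    ],
)

# Pretend Qdrant's similarity search returned these point IDs...
qdrant_ids = [1, 2, 3]

# ...then do the advanced filtering in plain SQL instead of Qdrant's filter DSL.
placeholders = ",".join("?" * len(qdrant_ids))
rows = conn.execute(
    f"SELECT name FROM products WHERE id IN ({placeholders}) "
    "AND city = ? AND price >= ?",
    [*qdrant_ids, "Cape Town", 500],
).fetchall()

print(rows)  # [('Gaming laptop',)]
```

<p>Qdrant handles the fuzzy "what is the user actually asking for" part; SQL handles the precise filtering it is good at.</p>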
<p>Another issue is that while the documentation is fairly okay, Qdrant is still very new, so there are limited resources and you have to dig into the docs quite a bit to find what you're looking for (although these days we just ask Claude, right?).</p>
]]></content:encoded></item><item><title><![CDATA[LLM Super Powers with Langchain Agents]]></title><description><![CDATA[Langchain is one of my favorite tools at the moment because of how it simplifies complex machine-learning tasks.
In some cases you may need more than just a well-written prompt, you may want to trigger different data sources or actions based on what ...]]></description><link>https://kevincoder.co.za/llm-super-powers-with-langchain-agents</link><guid isPermaLink="true">https://kevincoder.co.za/llm-super-powers-with-langchain-agents</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[Python]]></category><category><![CDATA[AI]]></category><category><![CDATA[langchain]]></category><category><![CDATA[LLM-Retrieval ]]></category><dc:creator><![CDATA[Kevin Naidoo]]></dc:creator><pubDate>Thu, 11 Apr 2024 15:51:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1712850340884/eb80adb5-f8d6-40dc-8277-8a7cdc312dce.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Langchain is one of my favorite tools at the moment because of how it simplifies complex machine-learning tasks.</p>
<p>In some cases you may need more than just a well-written prompt, you may want to trigger different data sources or actions based on what the user asks.</p>
<p>For example: in an e-commerce site, if the user asks to view a list of "shoes", you probably would do a keyword/semantic search and return a list of matching shoes.</p>
<p>If they ask about delivery information, you may want to look up that data from an SQL DB or API.</p>
<p>Langchain agents are a powerful mechanism at your disposal that will enable you to build complex custom LLM chatbots. In this article, we will go over what is an agent and how to build one.</p>
<h2 id="heading-what-is-a-langchain-agent">What is a Langchain Agent?</h2>
<p>When you use an LLM like OpenAI's "gpt-3.5-turbo", you typically send the model a prompt consisting of one or more messages, and the LLM responds accordingly.</p>
<p>So essentially, text in and text out. This is okay for a chatbot performing one particular task, like answering support questions, but what if you need to change the data source, do a web search, or record something in the DB?</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1712851656923/df6e7352-849f-4384-935b-10630b4eda16.jpeg" alt class="image--center mx-auto" /></p>
<p>This is where Agents come in handy. An Agent is basically a "task executor": it allows the LLM to execute functions and other code in your application based on reasoning and user input.</p>
<p>Think in terms of a switch statement, depending on which pathway is true, the switch statement will execute that particular block of code.</p>
<p>Agents are not exclusive to Langchain. Each LLM has a different way of handling agents, however, Langchain just provides a consistent API to work with regardless of which backend LLM you are using.</p>
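<p>Stripped of the LLM, the dispatching idea looks something like the sketch below, where the model's "reasoning" is replaced with naive keyword matching purely to show the control flow (the tools and responses are made up):</p>

```python
def search_products(query: str) -> str:
    return f"Found 3 products matching '{query}'"

def get_delivery_info(query: str) -> str:
    return "We deliver nationwide within 3-5 working days."

# An agent picks a tool based on the user's intent. A real LLM agent does
# this by reasoning over the tools' descriptions; here we fake it with
# keyword matching to illustrate the switch-statement-like dispatch.
TOOLS = {
    "shoes": search_products,
    "delivery": get_delivery_info,
}

def fake_agent(user_input: str) -> str:
    for keyword, tool in TOOLS.items():
        if keyword in user_input.lower():
            return tool(user_input)
    return "Sorry, I can't help with that."

print(fake_agent("When is delivery to Cape Town?"))
```

<p>An LLM agent replaces that brittle keyword loop with the model's own reasoning, but the shape of the system (user input, tool selection, tool output) stays the same.</p>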
<h2 id="heading-tools">Tools</h2>
<p>Since agents are task executors, we need some kind of "callback" for the agent to execute such as a function or class.</p>
<p>Tools take in either the raw prompt or a list of arguments and return some sort of output. Usually, you would return a string, but it's also possible to return more complex data like a LangChain document.</p>
<p>There are multiple ways of declaring a tool. We will cover the decorator approach since it's the most common and easiest solution to understand.</p>
<p>Here is an example:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">from</span> langchain.tools <span class="hljs-keyword">import</span> tool

<span class="hljs-meta">@tool</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">search_delivery_information</span>(<span class="hljs-params">prompt :str</span>) -&gt; str:</span>
    <span class="hljs-string">""" When the user requests delivery information """</span>

    <span class="hljs-comment"># We now return some text from an external API</span>
    <span class="hljs-comment"># The LLM will analyze this text and </span>
    <span class="hljs-comment"># - return the appropriate answer to the user.</span>
    <span class="hljs-keyword">return</span> requests.get(<span class="hljs-string">"/somewhere/delivery.json"</span>).text
</code></pre>
<p>Three essential components make up a tool:</p>
<ol>
<li><p>@tool - This decorator will take care of handling the input/output of your function in a way that the LLM can understand.</p>
</li>
<li><p>""" - The docstring; think of this as a prompt system message to the LLM, where you tell the LLM when to execute this function and provide any other useful context data.</p>
</li>
<li><p>The returned data. The LLM will ingest your function's response as context data and scope its response to that context.</p>
</li>
</ol>
<p>To clarify what I mean by "scope": without any context data, if the user asks "Which is the best shoe brand?", the LLM will respond with some generic answer based on its training data, similar to asking ChatGPT a question, so it may well respond with "Nike" or "Reebok".</p>
<p>However, if the tool returns "Addidas is our best brand.", then the LLM's response will regard "Addidas" as the best brand and not "Nike" or "Reebok".</p>
<h2 id="heading-putting-it-all-together">Putting it all together</h2>
<p>Okay great! Now you know what an agent is and how to create your custom callback functions to help the LLM better answer the user's question.</p>
<p>Let's now build an Agent and link it to our custom tool:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
<span class="hljs-keyword">from</span> langchain_core.prompts <span class="hljs-keyword">import</span> ChatPromptTemplate, MessagesPlaceholder
<span class="hljs-keyword">from</span> langchain.agents.format_scratchpad.openai_tools <span class="hljs-keyword">import</span> (
    format_to_openai_tool_messages,
)
<span class="hljs-keyword">from</span> langchain.agents.output_parsers.openai_tools <span class="hljs-keyword">import</span> OpenAIToolsAgentOutputParser

<span class="hljs-comment"># Create a standard Chat LLM</span>
llm = ChatOpenAI(model_name=<span class="hljs-string">"gpt-3.5-turbo-0125"</span>, temperature=<span class="hljs-number">0</span>)

<span class="hljs-comment"># Create a list of all tools you want to enable</span>
tools = [search_delivery_information]

<span class="hljs-comment"># Connect our tools to the LLM</span>
llm_with_tools = llm.bind_tools(tools) 

<span class="hljs-comment"># Build a chat prompt.</span>
<span class="hljs-comment"># Notice we have placeholders for the user's input</span>
<span class="hljs-comment"># - and a second placeholder for the Agent's context data.</span>
prompt = ChatPromptTemplate.from_messages(
        [
            (
                <span class="hljs-string">"system"</span>,
                <span class="hljs-string">"You are an e-commerce assistant."</span>,
            ),
            (<span class="hljs-string">"user"</span>, <span class="hljs-string">"{input}"</span>),
            MessagesPlaceholder(variable_name=<span class="hljs-string">"agent_ctx"</span>),
        ]
)

<span class="hljs-comment"># Next we build the actual agent.</span>
agent = (
        {
            <span class="hljs-string">"input"</span>: <span class="hljs-keyword">lambda</span> x: x[<span class="hljs-string">"input"</span>],
            <span class="hljs-string">"agent_ctx"</span>: <span class="hljs-keyword">lambda</span> x: format_to_openai_tool_messages(
                x[<span class="hljs-string">"intermediate_steps"</span>]
            )
        }
        | prompt
        | llm_with_tools
        | OpenAIToolsAgentOutputParser()
)
</code></pre>
<pre><code class="lang-python">  {
      <span class="hljs-string">"input"</span>: <span class="hljs-keyword">lambda</span> x: x[<span class="hljs-string">"input"</span>],
      <span class="hljs-string">"agent_ctx"</span>: <span class="hljs-keyword">lambda</span> x: format_to_openai_tool_messages(
           x[<span class="hljs-string">"intermediate_steps"</span>]
      )
  }
</code></pre>
<p>This dictionary, repeated above from the chain, might seem confusing at first glance, but basically, "input" is the user's input; the prompt template created earlier will replace "{input}" with the user's actual question or message.</p>
<p>Secondly, "agent_ctx": since our tool callbacks are just Python functions, there needs to be a translation step that converts the output of these functions into something the model can understand and the agent can transmit via the REST API; that is what format_to_openai_tool_messages does with the agent's intermediate steps.</p>
<p>You will also notice we chain one more object at the end, "OpenAIToolsAgentOutputParser", which parses the raw model output into a format the agent executor can understand.</p>
<p>Finally, we can instantiate an agent executor and prompt the LLM:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain.agents <span class="hljs-keyword">import</span> AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=<span class="hljs-literal">False</span>)
result = agent_executor.invoke({<span class="hljs-string">"input"</span>: <span class="hljs-string">"Where is my delivery?"</span>})
print(result)
</code></pre>
<h2 id="heading-wait-what-about-memory">Wait! What about memory?</h2>
<p>Naturally, users will ask follow-up questions and the LLM needs to be aware of these to provide a much more accurate answer.</p>
<p>Luckily, Langchain makes managing memory so much easier. Here is an example:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> myapp.models <span class="hljs-keyword">import</span> Message
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
<span class="hljs-keyword">from</span> langchain_core.messages <span class="hljs-keyword">import</span> AIMessage, HumanMessage
<span class="hljs-keyword">from</span> langchain_core.prompts <span class="hljs-keyword">import</span> (
    ChatPromptTemplate,
    MessagesPlaceholder,
)
<span class="hljs-keyword">from</span> langchain.agents.format_scratchpad.openai_tools <span class="hljs-keyword">import</span> (
    format_to_openai_tool_messages,
)
<span class="hljs-keyword">from</span> langchain.agents.output_parsers.openai_tools <span class="hljs-keyword">import</span> (
    OpenAIToolsAgentOutputParser,
)

<span class="hljs-comment"># ... other agent code as per above snippets</span>
prompt = ChatPromptTemplate.from_messages(
    [
        (
            <span class="hljs-string">"system"</span>,
            <span class="hljs-string">"You are an e-commerce assistant."</span>,
        ),
        <span class="hljs-comment"># Add new variable to replace with previous chat messages.</span>
        MessagesPlaceholder(variable_name=<span class="hljs-string">"chat_history"</span>), 
        (<span class="hljs-string">"user"</span>, <span class="hljs-string">"{input}"</span>),
        MessagesPlaceholder(variable_name=<span class="hljs-string">"agent_ctx"</span>),
     ]
)

<span class="hljs-comment"># Build a list of messages from the DB.</span>
chat_history = []

<span class="hljs-comment"># Very simple query, you may want to limit to</span>
<span class="hljs-comment"># - the last 10 conversations or something to that effect.</span>
messages = Message.objects.filter(is_deleted=<span class="hljs-literal">False</span>)
<span class="hljs-keyword">for</span> m <span class="hljs-keyword">in</span> messages:
    <span class="hljs-keyword">if</span> m.model_answer <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>:
        chat_history.extend(
            [
                HumanMessage(content=m.user_question),
                AIMessage(content=m.model_answer),
            ]
         )

agent = (
    {
        <span class="hljs-string">"input"</span>: <span class="hljs-keyword">lambda</span> x: x[<span class="hljs-string">"input"</span>],
        <span class="hljs-string">"agent_ctx"</span>: <span class="hljs-keyword">lambda</span> x: format_to_openai_tool_messages(
            x[<span class="hljs-string">"intermediate_steps"</span>]
        ),
        <span class="hljs-string">"chat_history"</span>: <span class="hljs-keyword">lambda</span> x: x[<span class="hljs-string">"chat_history"</span>],
    }
    | prompt
    | llm_with_tools
    | OpenAIToolsAgentOutputParser()
)
</code></pre>
<p>In the above code, we first add a new placeholder, "chat_history", to the original prompt template, and then in our agent pipeline we pull the messages from the list constructed above. Just remember to pass "chat_history" along when you invoke the executor, since the prompt template now expects it.</p>
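<p>As the comment in the snippet above suggests, you'll probably want to cap how much history you send to the model, both for cost and context-window reasons. Here's a minimal pure-Python sketch of a trimming helper; the function name and turn limit are my own, not a LangChain API:</p>
<pre><code class="lang-python">def trim_history(chat_history, max_turns=10):
    """Keep only the most recent conversation turns.

    chat_history is a flat list alternating human/AI messages,
    so one turn is two entries.
    """
    return chat_history[-(max_turns * 2):]


# Usage, with plain strings standing in for message objects:
history = [f"msg{i}" for i in range(30)]
recent = trim_history(history, max_turns=10)
print(len(recent))  # 20
</code></pre>
<p>You would run chat_history through something like this before handing it to the agent, keeping the prompt size roughly constant as conversations grow.</p>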
<h2 id="heading-security-concerns">Security concerns</h2>
<p>Since tool functions are essentially serialized and passed to the OpenAI REST API, it's important to ensure you do not expose any sensitive information in these functions.</p>
<p>Furthermore, if you're allowing the LLM to interact with your SQL RDBMS, be careful of SQL injection. Rather build an API around your data, or use a vector database like Qdrant.</p>
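<p>To make the SQL injection point concrete, here's a hedged sketch of a tool-style lookup that uses parameterized queries via sqlite3; the table, columns, and sample data are made up for the example. The key point is that model-influenced text is bound as a parameter, never string-formatted into the SQL:</p>
<pre><code class="lang-python">import sqlite3

def get_order_status(order_ref):
    """Look up an order's status. order_ref may originate from the
    LLM, so it is passed as a bound parameter."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (ref TEXT, status TEXT)")
    conn.execute("INSERT INTO orders VALUES (?, ?)", ("A100", "shipped"))

    # The ? placeholder lets the driver handle escaping safely.
    row = conn.execute(
        "SELECT status FROM orders WHERE ref = ?", (order_ref,)
    ).fetchone()
    conn.close()
    return row[0] if row else "Order not found."

print(get_order_status("A100"))             # shipped
print(get_order_status("A100' OR '1'='1"))  # Order not found.
</code></pre>
<p>Had the query been built with an f-string, the second call's payload would have matched every row; as a bound parameter it's just a literal value that matches nothing.</p>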
]]></content:encoded></item></channel></rss>