Open WebUI Transformer Error

Ran into issues with your Open WebUI container after setting a custom RAG model from Hugging Face? This post walks through the troubleshooting steps, from reading the container logs to fixing the sentence transformer setting in config.json, to get your AI server back up and running smoothly.

Well, I did a thing and caused my Open WebUI container to stop working. I changed a RAG setting to use a custom model from Hugging Face, and that led to the container not being able to start anymore. The game we play when we self-host.

For those not as familiar with the term, RAG (Retrieval-Augmented Generation) is a hybrid approach that pulls in external information (retrieval) and then uses a generation model to produce the final output. Think of it like adding extra brainpower to your transformer: it can search for relevant information before generating a response. Instead of training a model, you use RAG to augment and extend an existing model with your own data.
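To make that concrete, here's a minimal sketch of the retrieve-then-generate loop in Python. The chunks and model name are placeholders for illustration; under the hood, Open WebUI does the same dance with your configured sentence transformer and a vector database.

# A minimal sketch of the RAG pattern: embed the query, retrieve the most
# relevant document chunk, then hand it to a generation model as context.
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small embedder, used here as an example

chunks = [
    "The backup job runs nightly at 02:00.",
    "Grafana dashboards live behind the reverse proxy.",
    "The NAS exports /mnt/tank over NFS.",
]
chunk_vectors = embedder.encode(chunks)

query = "When do backups run?"
query_vector = embedder.encode(query)

# Retrieval: rank chunks by cosine similarity and keep the best match.
scores = util.cos_sim(query_vector, chunk_vectors)[0]
best = chunks[int(scores.argmax())]

# Generation: the retrieved chunk gets stuffed into the prompt as context.
prompt = f"Use this context to answer.\nContext: {best}\nQuestion: {query}"
print(prompt)  # this prompt would then be sent to the LLM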

I made a change

I have been playing around with all the cool features in Open WebUI. I really enjoy having an AI server in my home where we can keep our data, or use a public model when we need to. We can also limit how much we contribute to the ever-growing energy needs of these AI companies. With Ollama, only the AI model you need is loaded into memory, and it's removed from memory again when idle. It's more of an on-demand AI server, made even better if you use smaller models for most things.
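As an example of that on-demand behavior, Ollama's generate endpoint accepts a keep_alive parameter that controls how long a model stays loaded after a request. A quick sketch, assuming a default install on localhost:11434 and a pulled llama3.2 model:

# Sketch of Ollama's on-demand model loading via its REST API.
# Assumes a default Ollama install and that "llama3.2" has been pulled.
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2",   # loaded into memory only when requested
        "prompt": "Why is the sky blue?",
        "stream": False,
        "keep_alive": "5m",    # unload the model after 5 idle minutes
    },
    timeout=300,
)
print(response.json()["response"])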

The last setting I was playing around with in Open WebUI was the document sentence transformer. I wanted to see whether other models could make the RAG process even better, so I went to Hugging Face and found a model I had seen others online using. After I set it... I didn't test it. I stopped for the day, because homelab isn't my 9-5. Setting the sentence transformer to BAAI/bge-en-icl broke RAG and, as I'd find out later, kept the container from starting at all.
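In hindsight, the failure is easy to reproduce outside the container. A quick sketch using the sentence-transformers library directly (assuming you have it installed, and fair warning, the model download is large) hits the same error the container did:

# Reproducing the load failure outside Open WebUI.
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer

try:
    # This is effectively the same load Open WebUI attempts at startup for
    # the configured embedding model; with BAAI/bge-en-icl, building the
    # tokenizer fails (per the traceback, unless sentencepiece is installed).
    model = SentenceTransformer("BAAI/bge-en-icl")
except ValueError as err:
    print(f"Model failed to load: {err}")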

I should have noticed

The first sign that something was wrong was when someone tried to upload a document: the server showed the document as processing indefinitely. The document would never actually be fed into the model I had selected. When I was informed, I figured I would take a look at the server logs later to determine what was wrong. Well... I didn't. I just forgot about it.

Updates

The next day, having forgotten about the issue, I checked on my server to see if there were any updates. Of course, Open WebUI had an update available. So, instead of checking the logs like I should have, I updated the container, which led to it no longer starting at all.

Logs

Good thing this server is written in Python; it made reading the logs a little easier. I could see the container was constantly restarting, exiting each time with the following error:

jq: error (at <stdin>:0): break

Stack Traceback

The logs also included the following stack traceback:

    transformer_model = Transformer(
                        ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sentence_transformers/models/Transformer.py", line 58, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 897, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2271, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2505, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 157, in __init__
    super().__init__(
  File "/usr/local/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 106, in __init__
    raise ValueError(
ValueError: Cannot instantiate this tokenizer from a slow version. If it's based on sentencepiece, make sure you have sentencepiece installed.

The Fix

Luckily, there's an easy fix for this. The hint is the packages/sentence_transformers line in the traceback, which points to an issue with the configured sentence transformer. I would much rather set that back to the default than start all over with Open WebUI.

All you need to do is edit the config.json file in your Open WebUI config directory. This is the broken RAG section of the config:

"rag": {
                "pdf_extract_images": true,
                "youtube_loader_language": [
                        "en"
                ],
                "enable_web_loader_ssl_verification": null,
                "web": {
                        "search": {
                                "enable": true,
                                "engine": "google_pse",
                                "searxng_query_url": "",
                                "google_pse_api_key": "NONE",
                                "google_pse_engine_id": "NONE",
                                "brave_search_api_key": "",
                                "serpstack_api_key": "",
                                "serpstack_https": true,
                                "serper_api_key": "",
                                "serply_api_key": "",
                                "tavily_api_key": "",
                                "result_count": 5,
                                "concurrent_requests": 10
                        }
                },
                "template": "Use the following context as your learned knowledge, inside <context></context> XML tags.\n<context>\n    [context]\n</context>\n\nWhen answer to user:\n- If you don't know, just say that you don't know.\n- If you don't know when you are not sure, ask for clarification.\nAvoid mentioning that you obtained the information from the context.\nAnd answer according to the language of the user's question.\n\nGiven the context information, answer the query.\nQuery: [query]",
                "top_k": 5,
                "relevance_threshold": 0.0,
                "enable_hybrid_search": true,
                "embedding_engine": "",
                "embedding_model": "BAAI/bge-en-icl",
                "reranking_model": "",
                "CONTENT_EXTRACTION_ENGINE": "",
                "tika_server_url": "http://tika:9998",
                "chunk_size": 1500,
                "chunk_overlap": 100
        }

The issue was on line 29 of that snippet: the embedding_model setting. I reset it to an empty string, and the container started right up. Then I went into the UI, switched the embedding engine to Ollama and back to Default, which auto-filled the default embedding model. After that, I clicked the model download button, and everything was back up, with our original sentence transformer running smoothly again 😄.
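If you would rather script the reset than hand-edit the file, a minimal sketch does the trick. The path here assumes your container's data volume is mounted at ./open-webui; adjust it to match your setup, and stop the container before writing:

# One-off helper to reset the embedding model in Open WebUI's config.json.
# The path below is an assumption; point it at your actual data volume.
import json
from pathlib import Path

config_path = Path("open-webui/config.json")
config = json.loads(config_path.read_text())

# Clear the broken sentence transformer so Open WebUI falls back to its default.
config["rag"]["embedding_model"] = ""

config_path.write_text(json.dumps(config, indent=4))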

If you run into something similar, hopefully this post helps you resolve the issue and get your server back up and running.