
Optillm - https://github.com/codelion/optillm

optillm is an OpenAI API-compatible optimizing inference proxy that implements several state-of-the-art techniques to improve the accuracy and performance of LLMs. The current focus is on techniques that improve reasoning over coding, logical, and mathematical queries. Using these techniques, it is possible to beat the frontier models across diverse tasks by spending additional compute at inference time.
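Since the proxy is OpenAI API-compatible, any standard chat-completions client can talk to it; as I read the README, the inference technique is selected by prefixing the model slug (e.g. "moa-" for mixture-of-agents). A minimal sketch using only the stdlib — the URL, port, and slug convention are assumptions, so check your own optillm setup:

```python
import json
from urllib.request import Request, urlopen  # stdlib; any OpenAI SDK works too

# Assumed default address of a locally running optillm instance.
OPTILLM_URL = "http://localhost:8000/v1/chat/completions"

def approach_model(approach: str, model: str) -> str:
    # optillm appears to pick the technique from a prefix on the model slug,
    # e.g. "moa-gpt-4o-mini" for mixture-of-agents (my reading of the README).
    return f"{approach}-{model}"

payload = {
    "model": approach_model("moa", "gpt-4o-mini"),
    "messages": [{"role": "user", "content": "What is 12 * 7?"}],
}

# Requires a running optillm instance; uncomment to actually send the request.
# req = Request(OPTILLM_URL, data=json.dumps(payload).encode(),
#               headers={"Content-Type": "application/json",
#                        "Authorization": "Bearer sk-..."})
# reply = json.loads(urlopen(req).read())
# print(reply["choices"][0]["message"]["content"])
```

Because the interface is just the OpenAI wire format, swapping the proxy in under existing tooling should only require changing the base URL and the model name.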




This is amazing, thanks for sharing. I'm implementing some of these techniques myself right now, but being able to try out different algorithms, and having plugins etc. available immediately, is really cool! Can't wait to try it out.

How are you dealing with structured outputs?


>How are you dealing with structured outputs?

The models have gotten much better at generating them with just the prompt. I have not implemented strict support for structured outputs or JSON generation yet. The responses from the proxy are all raw text.

One way would be to apply outlines or a similar library as a plugin to enable structured outputs.
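Until strict structured-output support lands in the proxy, a client-side workaround is to prompt for JSON and parse the raw text defensively. A minimal sketch — the fence-tolerant extraction below is my own assumption about what responses look like, not anything optillm does:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Pull the first JSON object out of a raw text completion,
    tolerating surrounding prose and markdown code fences.
    A client-side sketch, not optillm's behavior."""
    # Greedy match from the first "{" to the last "}" in the response.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    return json.loads(match.group(0))

# Example raw completion, as a model might wrap it in a fenced block.
raw = 'Sure! Here you go:\n```json\n{"answer": 84, "confidence": "high"}\n```'
parsed = extract_json(raw)
print(parsed)  # {'answer': 84, 'confidence': 'high'}
```

A library like outlines goes further by constraining generation to a schema at decode time, which guarantees validity instead of repairing output after the fact.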




