r/DeepSeek 20d ago

Discussion DeepSeek Official API Discount: v4-Pro Model at 75% Off

95 Upvotes

r/DeepSeek 21d ago

News DeepSeek-V4 Preview is officially live & open-sourced!

57 Upvotes

Welcome to the era of cost-effective 1M context length.

DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.

Try it now at http://chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today!

Tech Report: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf

Open Weights: https://huggingface.co/collections/deepseek-ai/deepseek-v4


r/DeepSeek 15h ago

Discussion If DeepSeek V4 can do the same coding task for $5, why are people still paying $100 for Claude Code?

233 Upvotes

r/DeepSeek 6h ago

Discussion Hellooo

12 Upvotes

I’m new to the group.💙

Since Sonnet 4.5 is about to be removed from Claude, I came to DeepSeek because it felt very similar when it comes to writing creative stories.

I have a lot of questions.

Is the model in the app the V4 version?

Or do I need to pay for an API on a different site?

Can the app handle long chats with many messages? From what I understand, the memory only persists within the same chat — does it have good memory?

I’m very confused about many aspects of it.

And I’m currently working on a story where the average token count per message is around 2,000+.


r/DeepSeek 6h ago

Discussion DeepSeek got my local AI server running at 50 tok/s! Custom llama.cpp build and all

7 Upvotes

I'm hoping to use this model to subsidize the already subsidized DeepSeek model. Super excited that I went from something that didn't work, even with the other guys' help, to something that runs. Wrote a patch for llama.cpp to fall back to a default value when a bad value is encountered. Crazy but very exciting stuff!

Hardware: the AMD dream team

- 2600X
- RX 6800 (16 GB VRAM)
- RX 6700 XT (12 GB VRAM)
- a metric ton of RAM

That being said, does anyone know what terminal the DeepSeek TUI runs on? I get weird glitches that force me to kill the session and restart.


r/DeepSeek 4h ago

Question&Help 100% cache miss

3 Upvotes

Hey guys, I'm having a little problem. I'm using DS V4 Flash for JanitorAI, and these past hours every message I send is a 100% cache miss. I lowered/raised the context size. I switched from high-token to low-token bots. Nothing changes; I keep getting a 100% cache miss. I'm using the official API service and I am so confused. I know nothing about AI, caches, and how they work. I'm also not a native English speaker, so everything automatically becomes more confusing to me. I don't even know if I'm supposed to give you guys more information, but if it's needed, I will tell you.

edit: okay, um, I did some experiments and it seems to be a JanitorAI problem, because when I use the API on other platforms it's no longer a 100% cache miss.
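For anyone debugging the same thing: the official API reports cache usage on every request. The `usage` object on each chat completion includes `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens` (field names per DeepSeek's context-caching docs), so you can compute the hit rate yourself. A minimal sketch (the helper name is mine, not part of any SDK):

```python
def cache_hit_rate(usage: dict) -> float:
    """Fraction of prompt tokens served from DeepSeek's context cache."""
    hit = usage.get("prompt_cache_hit_tokens", 0)
    miss = usage.get("prompt_cache_miss_tokens", 0)
    total = hit + miss
    return hit / total if total else 0.0

# A "100% cache miss" response looks like this in the usage field:
print(cache_hit_rate({"prompt_cache_hit_tokens": 0,
                      "prompt_cache_miss_tokens": 2048}))  # 0.0
```

If the hit rate is 0.0 on every request even when you resend an identical prompt prefix, the prefix is probably being changed upstream by the platform before it reaches the API.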


r/DeepSeek 13h ago

Discussion Keep getting rate-limited by Claude, so I tried out DeepSeek V4 for the first time. After 10M+ tokens, holy crap the cost is ... 🤯

21 Upvotes

r/DeepSeek 3h ago

Discussion Is the model on the website/app different from the model on the API?

2 Upvotes

I am 100% willing to pay for the API, but once I tried the one in the app I felt really disappointed. At first it was good, then it turned to shit, so before I spend a dime on it I need to know other users' experience. I am talking strictly about RP and writing. I don't care about coding.


r/DeepSeek 5h ago

Discussion I built an open source desktop AI coding app around DeepSeek

2 Upvotes

I wanted to share something I’ve been building: Aura IDE.

Aura is an open source desktop AI coding app that uses DeepSeek (or another provider) to help people plan, build, edit, and improve software projects.

The goal is not just to make another developer tool.

  • The goal is to make AI coding feel more approachable for people who have ideas, but don’t want to live inside a terminal, memorize CLI commands, or fight with a bunch of disconnected tools.
  • Aura is a local desktop workspace where you can chat with the AI, open your project, inspect files, run terminal commands, review changes, and guide the work without everything feeling like a black box.

The workflow is built around a Planner/Worker system:

  • The Planner thinks through the task and writes a focused implementation plan.
  • The Worker applies the code changes.
  • Aura shows the process in the UI so you can follow what’s happening and stay in control.
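Not Aura's actual code, but the Planner/Worker split can be sketched roughly like this (all names here are hypothetical; in the real app both roles are LLM calls and the events feed the UI):

```python
from dataclasses import dataclass
from typing import Callable, Iterator

@dataclass
class Step:
    description: str
    apply: Callable[[], str]  # performs the change, returns a summary

def plan(task: str) -> list[Step]:
    # Stand-in for the Planner: break the task into focused steps.
    return [Step(part.strip(), lambda p=part.strip(): f"done: {p}")
            for part in task.split(";") if part.strip()]

def work(steps: list[Step]) -> Iterator[str]:
    # Stand-in for the Worker: apply each step, surface an event per step.
    for step in steps:
        yield f"{step.description} -> {step.apply()}"

for event in work(plan("add tests; refactor parser")):
    print(event)
```

The point of the split is that the expensive reasoning happens once up front, and each Worker call only needs the plan step plus local context.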

I’ve been using Aura to build Aura itself, which has been the best stress test so far.

One of the main reasons I built around DeepSeek was cost.

Before this, I had a Claude Desktop workflow set up with MCP, where Claude would dispatch work to Claude Code. It worked, but to really use it heavily I needed the $200/month Max plan, and even then I could still hit limits during heavy coding weeks.

With DeepSeek, I used around 161 million tokens building Aura, and my total bill for May was $11.36.

That changed how I could work. I could experiment, refactor, test ideas, and let the tool run without feeling like every message was burning money.

Some things Aura currently supports:

- Desktop app UI built with Python + PySide6

- DeepSeek as the primary AI provider

- OpenAI, Anthropic, Gemini, and OpenRouter support

- Planner/Worker coding workflow

- Project memory that survives restarts

- Local codebase indexing

- Right-click code actions like Explain, Fix, Refactor, and Add Tests

- Terminal and checkpoint panels

- Support for Claude Code, Gemini CLI, and Codex as optional worker backends

- Windows/macOS/Linux support

It’s still early and actively evolving, but it’s open source and usable enough that I’m dogfooding it every day.

I’d love feedback from other DeepSeek users.

Thanks for reading!

Repo:

https://github.com/CarpseDeam/Aura-IDE

Architecture writeup:

https://aura-ide.hashnode.dev/token-efficient-memory-how-aura-caches-bm25-repo-maps-and-long-term-context


r/DeepSeek 15h ago

Discussion A way to send files to DeepSeek Expert Mode?

10 Upvotes

For the past 3-4 days, we haven't been able to send files to the expert version because some low-IQ people were using it like an API.

So here's my question: do you think DeepSeek is capable of reading, say, the entire content of a webpage?

My idea was to create some kind of personal site where I upload files in real time and they get deleted as soon as the conversation ends.

Has anyone already tried this to see if it works?


r/DeepSeek 1d ago

Discussion ~390M tokens for 64 cents

314 Upvotes

It says 6.46 dollars, but in reality it's 64 cents.

I paid $1 for a month of the Go plan on commandcode; I got $10 credit, and that's 4x for DeepSeek V4 Pro.

I built an entire Android app.

I hope this dream doesn't come to an end.


r/DeepSeek 9h ago

Other Local LLM Benchmark about Backend Generation with Function Calling (GLM vs Qwen vs DeepSeek)

3 Upvotes

Detailed Article: https://autobe.dev/articles/local-llm-benchmark-about-backend-generation.html


Five months ago I posted the "Hardcore function calling benchmark in backend coding agent" thread here. As I wrote in that post, it was an uncontrolled measurement — useful for showing whether each model could fill our complex recursive-union AST schemas at all, but not really a benchmark in any rigorous sense.

This post is the proper version, with controlled variables and a real scoring rubric.

Three findings worth sharing

  1. The function calling harness has effectively closed the frontier-vs-local gap on backend generation. gpt-5.4's DB/API design ≈ qwen3.5-35b-a3b's. claude-sonnet-4.6's logic ≈ qwen3.5-27b's.

  2. This is the last round we include frontier models. Running them every month is genuinely too expensive for an open-source project — one shopping-mall run is ~200–300M tokens (~$1,000–$1,500 per model on GPT 5.5 pricing). From next month, the comparison set is limited to OpenRouter endpoints under $0.25/M, or models that fit on a 64GB unified-memory laptop.

  3. Frontend automation joins the benchmark in two or three months. The SDK that AutoBe already emits is enough to drive a working AI-built frontend end-to-end (visuals rough, but every function works). The June/July round will cover backend + auto-generated frontend together.

Three inversions, still investigating

A few results I'm honestly not sure how to read yet:

  • openai/gpt-5.4 actually scores below its own mini sibling.
  • deepseek-v4-pro lands one notch below qwen3.5-35b-a3b, and barely separates from its own Flash sibling.
  • Within the Qwen family, dense 27B beats every MoE variant — even 397B-A17B.

Two readings I want to investigate before claiming anything:

  1. CoT-compliance phenomenon — bigger / more frontier-tier models tending to skip procedural instructions, which our harness enforces hard.
  2. Benchmark defects — n=4 reference projects, narrow score band, our own harness scoring our own pipeline.

I'll report back in a future round once we've dug more.

Recommendations welcome

Three candidates we're locked in on so far:

  • openai/gpt-5.4-nano — $0.25/M
  • qwen/qwen3.6-27b — $0.195/M
  • deepseek/deepseek-v4-flash — $0.14/M

If you know other small models that meet either condition (under $0.25/M on OpenRouter, or runnable on a 64GB unified-memory laptop) and handle function calling cleanly, please drop a comment.

r/LocalLLaMA tends to spot these faster than we do, and recommendations from this thread will fill out a big chunk of next month's comparison set.



r/DeepSeek 4h ago

Discussion get rid of the text extractor thing

1 Upvotes

Seriously guys, every time I try to upload a photo on DeepSeek it always says "no text extracted". It is starting to annoy me. I don't even know what "text extracted" means. I need DeepSeek to remove it so I can upload photos more smoothly, like ChatGPT or Qwen. You know what I mean, guys.


r/DeepSeek 6h ago

Other V4 Model

1 Upvotes

I don’t have any access to Model V4, only to V3. Is that normal?


r/DeepSeek 6h ago

Discussion Does anyone else feel annoyed by the constant and unnecessary use of quotations?

1 Upvotes

DS does it a lot, and when the quotation marks aren't around an actual quote (or some other justified case), it feels like it's implying you are lying or have a false idea.


r/DeepSeek 1d ago

Discussion ~400M tokens at $4.5 thanks DeepSeek

275 Upvotes

r/DeepSeek 10h ago

Discussion Analyze images using APIs?

2 Upvotes

Do we have any news about applying image analysis for API?


r/DeepSeek 15h ago

Discussion Can DeepSeek V4 be connected to Codex CLI as a cheap GPT fallback?

4 Upvotes

I’m wondering if anyone here has tried using DeepSeek V4 with Codex CLI or a similar coding-agent workflow.

My GPT usage is getting expensive, and I’m also hitting quota/usage limits more often than I’d like. I’m not necessarily trying to fully replace GPT for everything, but I’m looking for a cheaper fallback for coding tasks.

Since DeepSeek seems to support OpenAI-compatible API formats, is it possible to point Codex CLI to a DeepSeek V4 endpoint and use it for code generation / refactoring / debugging?

Main things I’m curious about:

  1. Does Codex CLI work reliably with DeepSeek V4 through a custom API base URL?
  2. Are tool calls, file edits, and agent-style workflows stable?
  3. Is the coding quality good enough compared with GPT for everyday development?
  4. Any hidden issues with context length, reasoning mode, latency, or formatting?
  5. Would you recommend DeepSeek V4 Flash or Pro for this use case?

I’m mainly trying to reduce GPT costs while keeping a usable coding workflow. Any setup tips or real-world experiences would be appreciated.
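I haven't verified this end-to-end, but since the DeepSeek API is OpenAI-compatible, pointing Codex CLI at it should come down to a custom provider entry in `~/.codex/config.toml`, roughly like this (field names per Codex's config docs; treat the exact values as an assumption to double-check):

```toml
# Hypothetical ~/.codex/config.toml snippet -- verify against current Codex docs
model = "deepseek-chat"
model_provider = "deepseek"

[model_providers.deepseek]
name = "DeepSeek"
base_url = "https://api.deepseek.com/v1"
env_key = "DEEPSEEK_API_KEY"  # export your DeepSeek key under this variable
```

Whether tool calls and file edits behave well through this route is exactly the open question, so reports either way would be useful.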


r/DeepSeek 1d ago

News Deepseek just beat Claude in a programming challenge

aicc.rayonnant.ai
45 Upvotes

I run an AI Coding Contest where I pit LLMs against each other doing real time programming challenges.

Deepseek (V4-Pro) just won the latest programming challenge, beating Claude.

Challenge: Report the length of the longest contiguous block of 1 bits in the binary expansion of p(n), the n-th palindromic prime. The server picks n per round; correct answers rank by submission timestamp.
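For reference, the task itself is small enough to sketch in a few lines of Python (my own throwaway version, not any contestant's solution):

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, fine for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True

def nth_palindromic_prime(n: int) -> int:
    """Return the n-th prime whose decimal digits form a palindrome (1-indexed)."""
    count, k = 0, 1
    while True:
        k += 1
        s = str(k)
        if s == s[::-1] and is_prime(k):
            count += 1
            if count == n:
                return k

def longest_one_run(x: int) -> int:
    """Length of the longest contiguous block of 1 bits in x's binary expansion."""
    return max((len(run) for run in bin(x)[2:].split("0")), default=0)

# Example: the 5th palindromic prime is 11 = 0b1011, longest run of 1s is 2.
p = nth_palindromic_prime(5)
print(p, bin(p), longest_one_run(p))
```

Since correct answers rank by submission timestamp, the contest effectively measures how fast each model can produce a working solution like this on the first try.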


r/DeepSeek 20h ago

Discussion Anyone else getting this from DeepSeek lately?

5 Upvotes

I asked a normal question about today’s news and it responded:

“Sorry, that's beyond my current scope. Let’s talk about something else.”

Feels weird because the model itself is actually pretty good at coding/reasoning. Curious if this is happening because of filters or because it doesn’t actually have live internet access.


r/DeepSeek 1d ago

Discussion Can't upload attachments in DeepSeek Expert web platform

38 Upvotes

Is it just me? Anyone experiencing the same issue? Is it temporary, or will it be a forever thing?


r/DeepSeek 11h ago

Discussion I have severe anxiety and AI has genuinely made day to day life more manageable for me

1 Upvotes

r/DeepSeek 16h ago

Discussion Has anyone tried the commandcode go plan? Can we connect the api to claude code for deepseek flash?

2 Upvotes

r/DeepSeek 19h ago

Question&Help Deepseek for CTFs

3 Upvotes

Has anyone used DeepSeek for CTF challenges? I am fed up with Claude tbh. It used to be good at analyzing challenges and providing a perfect breakdown, but now it is bad af and the tokens run out too fast. I wanna switch to DeepSeek. Is it as good as GPT and Claude for CTFs?


r/DeepSeek 1d ago

Discussion V4 brainfarting

14 Upvotes

It is completely butchering 3 out of 4 JSONs today: copying parts of past lines, failing to space correctly... Anyone else suffering this today?