DeepSeek

DeepSeek (Shēndù Qiúsuǒ in Chinese) is an artificial-intelligence chatbot application launched on January 10, 2025 by the Chinese company DeepSeek, which specializes in dialogue

The chatbot is a language model fine-tuned with both supervised and reinforcement learning techniques

It comprises DeepSeek's family of models: DeepSeek LLM, DeepSeek-V2, DeepSeek-V3, and DeepSeek-R1

Background

In February 2016, High-Flyer was co-founded by AI enthusiast Liang Wenfeng, who had been trading since the 2007-2008 financial crisis while attending Zhejiang University

In 2019, he established High-Flyer as a hedge fund focused on developing and using AI trading algorithms

By 2021, High-Flyer was using AI exclusively in its trading

According to estimates by 36Kr, Liang had accumulated a stockpile of more than 10,000 Nvidia A100 chips before the United States government imposed AI chip restrictions on China

Dylan Patel of the research consultancy SemiAnalysis estimated that DeepSeek had at least 50,000 chips

In April 2023, High-Flyer launched an artificial general intelligence laboratory dedicated to developing AI tools independently of High-Flyer's financial business

In May 2023, with High-Flyer as one of the investors, the laboratory became its own company, DeepSeek

Venture capital firms were reluctant to provide financing, since it was unlikely they could generate an exit (a return on investment) in a short period of time

After launching DeepSeek-V2 in May 2024, which offered strong performance at a low price, DeepSeek became known as the catalyst of the price war among China's AI models

It was quickly dubbed the "Pinduoduo of AI", and other tech giants such as ByteDance, Tencent, Baidu, and Alibaba began cutting the prices of their AI models to compete with the company

Despite DeepSeek's low prices, the company was profitable, while its rivals were losing money

So far, DeepSeek has focused solely on research and has no detailed commercialization plans

DeepSeek's hiring preferences favor technical skills over work experience, so most of its new employees are recent university graduates or developers whose careers are less established

Versions

DeepSeek LLM

On November 2, 2023, DeepSeek released its first model, DeepSeek Coder, which is available free of charge to both researchers and commercial users

The model's code was made open source under the MIT License, with an additional license agreement concerning "open and responsible downstream use" of the model itself

On November 29, 2023, DeepSeek released DeepSeek LLM, which scaled up to 67 billion parameters

It was developed to compete with other LLMs available at the time, with performance approaching that of GPT-4

However, it faced challenges in computational efficiency and scalability

A chatbot version of the model, called DeepSeek Chat, was also launched

DeepSeek-V2

In May 2024, DeepSeek-V2 was released

The Financial Times reported that it was cheaper than its peers at a price of 2 RMB per million output tokens

The University of Waterloo's Tiger Lab ranked DeepSeek-V2 seventh on its LLM leaderboard

DeepSeek-V3

In December 2024, DeepSeek-V3 was released

It has 671 billion parameters and was trained in around 55 days at a cost of US$5.58 million, using significantly fewer resources than its peers

It was trained on a dataset of 14.8 trillion tokens

Benchmark tests showed that it outperformed Llama 3.1 and Qwen 2.5 while matching GPT-4o and Claude 3.5 Sonnet

DeepSeek's optimization under limited resources highlighted the potential limits of US sanctions on China's AI development

An opinion piece in The Hill described the launch as American AI reaching its "Sputnik moment"

The model is a mixture-of-experts transformer with multi-head latent attention, containing 256 routed experts and 1 shared expert; each token activates 37 billion parameters
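The routing idea behind a mixture of experts can be sketched in a few lines. This is only an illustrative sketch, not DeepSeek's actual implementation; the gating scores are random here and the top-k value of 8 is an assumption for the example:

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_token(gate_scores, num_routed=8, shared_experts=(0,)):
    """Pick the top-k routed experts for one token, plus the always-on
    shared expert(s); only those experts' parameters are activated."""
    top_k = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i],
                   reverse=True)[:num_routed]
    weights = softmax([gate_scores[i] for i in top_k])
    return list(shared_experts), list(zip(top_k, weights))

random.seed(0)
scores = [random.gauss(0, 1) for _ in range(256)]  # one gate score per routed expert
shared, routed = route_token(scores, num_routed=8)
print(len(routed))  # 8: only a small subset of the 256 routed experts fires
```

Because only the selected experts run for each token, the number of activated parameters is a small fraction of the total, which is how a 671B-parameter model can activate only 37B per token.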

On January 27, 2025, the AI assistant of the Chinese startup DeepSeek surpassed ChatGPT as the top-rated free app on the US App Store

This has sparked debate about the effectiveness of US export restrictions on advanced artificial-intelligence chips to China

The DeepSeek-V3 model, which uses Nvidia's H800 chips, is gaining recognition for its competitive performance, challenging the global dominance of US AI models.

DeepSeek R1

In November 2024, DeepSeek released R1-Lite-Preview, trained for logical inference, mathematical reasoning, and real-time problem solving

DeepSeek said its performance exceeded OpenAI's o1 on benchmarks such as the American Invitational Mathematics Examination (AIME) and MATH

However, The Wall Street Journal reported that when it used 15 problems from the 2024 edition of AIME, the o1 model reached solutions faster than DeepSeek R1-Lite-Preview

On January 20, 2025, DeepSeek-R1 and DeepSeek-R1-Zero were released

They were based on V3-Base

Like V3, each is a mixture of experts with 671B total parameters and 37B activated parameters

They also released several "DeepSeek-R1-Distill" models, which are not based on R1 itself

Instead, they are other open-weight models, such as LLaMA and Qwen, fine-tuned on synthetic data generated by R1

R1-Zero was trained exclusively through reinforcement learning (RL), without any supervised fine-tuning (SFT)

It was trained using group relative policy optimization (GRPO), which estimates the baseline from the scores of a group of sampled outputs instead of using a critic model

The reward system is rule-based and consists mainly of two types of rewards: accuracy rewards and format rewards
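The group-relative baseline is simple to illustrate. The sketch below shows only the advantage computation (rewards normalized against their own group), not the full GRPO objective; the reward values are made up for the example:

```python
def group_relative_advantages(rewards):
    """GRPO-style baseline: score each sampled answer relative to the
    mean and standard deviation of its own group, with no critic model."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    return [(r - mean) / (std + 1e-8) for r in rewards]

# Example: rule-based rewards (accuracy + format) for 4 sampled answers
advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
print(advantages)
```

Answers that scored above their group's mean get a positive advantage and are reinforced; those below get a negative one, all without training a separate value network.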

R1-Zero's outputs are not very readable and switch between English and Chinese, so DeepSeek trained R1 on top of it to address these problems and further improve reasoning

Concerns

Censorship

Some sources have observed that the official API version of R1 uses censorship mechanisms for topics considered politically sensitive by the government of the People's Republic of China

For example, the model refuses to answer questions about the 1989 Tiananmen Square protests, the persecution of the Uyghurs, or human rights in the People's Republic of China

The AI may initially generate an answer, but shortly afterwards it deletes it and replaces it with a message such as:

"Sorry, that's beyond my current scope. Let's talk about something else."

The built-in censorship mechanisms and restrictions can only be removed in the open-source version of the R1 model

If a conversation touches on the "core socialist values" defined by Chinese internet regulators, or raises the political status of Taiwan, the discussion is terminated

When tested by NBC News, DeepSeek's R1 described Taiwan as an inalienable part of China's territory and declared:

"We firmly oppose any form of 'Taiwan independence' separatist activity and are committed to achieving the complete reunification of the motherland through peaceful means"

In January 2025, Western researchers were able to trick DeepSeek into giving accurate answers on some of these topics by rephrasing the question

Security and privacy

There are also fears that the AI system could be used for foreign influence operations, disinformation, surveillance, and the development of cyberweapons for the government of the People's Republic of China

DeepSeek's privacy terms and conditions state the following:

"We store the information that we collect in secure servers located in the People's Republic of China ... We may collect your text or audio input, prompt, uploaded files, feedback, chat history, or other content that you provide to our model and services"

Although the data storage and collection policy is consistent with ChatGPT's privacy policy, a press article reports that this represents a security concern

In response, the Italian Data Protection Authority requested additional information about DeepSeek's collection and use of personal data, and the United States National Security Council announced that it had initiated a national security review

However, when DeepSeek is used locally, the data is not shared publicly

Using DeepSeek

Account creation

Before you can use DeepSeek, you need an account registered in the DeepSeek system

For this we will use the following registration link

We will enter our email address and a password, or use our Google account

We will go to our email account, which must be valid and real

And we will confirm the account registration by clicking on the verification email that they will send us

We will continue by entering the rest of the user data requested in the form

We can then start using the DeepSeek chat prompt at any time by signing in with the username and password of the account we have created

Local installation

Apart from the official DeepSeek API, we can also install the model locally on our device; for this we will use Ollama, a client for artificial-intelligence models

Ollama

Ollama is a program you can install on any computer, whether it runs Windows, macOS, or GNU/Linux

It is a client for artificial-intelligence models, so it is the base on which you then install the AI you want to use

Ollama has two distinctive features

  • It lets you run an AI locally

    This means that instead of going to a company's AI chat page, the model is installed on your computer and you use it directly without visiting any website

    That benefits us in the following ways:

    • The data from everything you do stays on your PC, so no company can use it

    • You can use the AI without an Internet connection
    • You can bypass the censorship that a web-hosted artificial-intelligence model may impose
    • However, it cannot search online to supplement its information
  • It works through your computer's terminal (Command Prompt on Windows, a shell on macOS or GNU/Linux)

    This means there is no separate application to use

    Once you install Ollama, you use your device's console to install and run the model you want; you type your questions and prompts in the console, where you will also receive the answers

Installing Ollama on Windows

It is as simple as going to their website and clicking on the Download button

Now choose the operating system on which you want to install it (on Windows, the minimum version is Windows 10)

Once chosen, click on the Download button

By default the site will show the system you are using, but you can download the executable for any other

Once downloaded, launch the installer

Installing Ollama is very simple: just click the Next button on the welcome screen, and then the Install button on the installation screen

Once you have installed Ollama, launch the application

It will seem that nothing happens (only an icon appears in the taskbar); this is because you have to open your computer's terminal (with administrator permissions), which on Windows is the Command Prompt

Now, before you start, go to the website where you will see all the available AI models

Since we want to use DeepSeek, go to its page and you will see all the available variants of that model

Choose according to your machine's capacity in GB of memory and available hard disk space, using the model information to guide you

For the examples I am going to use deepseek-coder-v2, since my machine only has 12 GB of RAM and the model occupies 8.9 GB on the hard drive

Once chosen, look in the model information for a box on the right with a button that lets you copy the command, since we will use it in the Command Prompt

In my case:
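For the model chosen above, the copied command would typically be Ollama's run command (shown here assuming the deepseek-coder-v2 tag; copy the exact text from the model page):

```shell
ollama run deepseek-coder-v2
```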

I simply paste it into the Command Prompt and wait for the DeepSeek model to install (only the first time) and for the cursor to enter prompt mode, with the model responding for the first time

The next time you use the model you must paste the command again, but it will take less time to respond because the model will already be installed

Installing Ollama on Android

Before starting, our Android device needs to meet the following prerequisites:

  • At least 4.5 GB of RAM
  • A stable Internet connection to download Termux, Ollama, and the DeepSeek model
  • Android >= 7

In addition to Ollama, we are going to use the terminal emulator Termux

To install it, go to the Termux development website and choose the latest stable APK for your version of Android

It can also be found on the Play Store, but the latest stable version there may not match, or may be older than, the one on the Termux development website

We install the APK file on our Android device

We open Termux to access the terminal

Once inside the terminal, we need to grant Termux access to the device's storage

To do this we will execute:
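On a standard Termux install, the storage-permission command is:

```shell
termux-setup-storage
```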

To have Termux and the updated packages we will run:
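The usual Termux update commands are:

```shell
pkg update && pkg upgrade
```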

We wait for the update process to end

Now we will install Ollama executing the following command:
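Assuming the ollama package available in the Termux repository:

```shell
pkg install ollama
```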

Now we will start ollama with the following command:
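The server is started with (the trailing & keeps it running in the background so we can continue using the same terminal):

```shell
ollama serve &
```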

As in the Windows installation, go to the website listing all available AI models and open the DeepSeek entry to see the available variants of the model

Choose according to your device's memory and available storage, using the model information to guide you; for these examples I am again using deepseek-coder-v2, which occupies 8.9 GB

Once chosen, copy the command from the box on the right of the model information and paste it into the Termux terminal, then wait for the DeepSeek model to install (only the first time) and for the cursor to enter prompt mode

The next time you use the model you must paste the command again, but it will take less time to respond because the model will already be installed

Privacy

You have to take special care when using DeepSeek

By default our conversations are stored in a history and can be used to continue training DeepSeek

If we do not want our data to be used for training, there is an option in our account settings to deactivate the use of our data to train DeepSeek (and the conversation history)

Prompts

DeepSeek is trained to follow and execute the instructions that we provide

Our instructions are called prompts

They can be as simple or complex as we want, and can include additional information

For example, an example text, an image, a link to a web page...

We can "talk" to DeepSeek interactively

For example, asking it to complete or correct its previous answer

That means we can ask it something, and then refer to either our previous question or its answer:

We ask it to make the answer a little more serious

Effective prompts

DeepSeek is quite literal in interpreting our instructions, so it is best to give it all the information it needs to complete its tasks according to our expectations

Overall, a good prompt must include:

  1. Role: for DeepSeek (expert in ..., assistant of ...)
  2. Context: the situation surrounding the text we need to generate
  3. Instructions/tasks: what we need it to do for us
  4. Format/style: whether we want a formal letter, a more modern or aggressive style... or whether we need the response formatted as JSON, for example
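The four parts above can be assembled mechanically. A minimal sketch in Python, where all the wording (the role, context, task, and style strings) is purely illustrative:

```python
def build_prompt(role: str, context: str, task: str, style: str) -> str:
    """Assemble a prompt from the four recommended parts:
    role, context, instructions/task, and format/style."""
    return (f"You are {role}.\n"
            f"Context: {context}\n"
            f"Task: {task}\n"
            f"Format: {style}")

prompt = build_prompt(
    role="an expert technical writer",
    context="we are documenting a REST API for new developers",
    task="write a short introduction to the authentication endpoint",
    style="formal tone, response formatted as JSON",
)
print(prompt)
```

Keeping the parts separate like this makes it easy to vary one dimension (for example the format) while reusing the rest of the prompt.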

You can even ask DeepSeek itself to give you more advice

Prompts are normally written in English, but remember that you can ask DeepSeek to translate them into your language

Examples

Fairy tale

Let's ask it to write a fairy tale with a happy ending

We ask it to change the ending of the story to a sadder one

We ask it to add a moral to the story

Finally, we ask it to retell the story with Hansel and Gretel encountering a dragon, and a fairy godmother giving them advice to defeat it

Letter

We will simulate that we are an employee of the General Directorate of Public Works using DeepSeek for help in their day-to-day work

We ask it to generate a letter informing a user that pipeline works are going to be carried out that will cross their property

We ask it to generate a second letter informing the user that their request to stop the works has been rejected

What it can do

There are many things DeepSeek can do very well; just ask for them

Below we present some examples of applications:

Creative tasks

  • Generate fictional stories
  • Generate technical documentation (if we provide it with enough information)
  • Texts for project proposals
  • Reports
  • Letters
  • Brainstorming: Product names, titles of works...

Training

  • Create text summaries
  • Generate activities, test-type exercises
  • Plan classes
  • Generate agendas
  • Code

Proofreading

  • Review texts to correct grammar and spelling
  • Change style:
    • Depending on the audience (for a high school student, for a scientist...)
    • Depending on the role of DeepSeek ("Write in the style and vocabulary of a university literature professor / of a high school student ...")

Translation

  • DeepSeek has been trained on a corpus that includes a large number of languages, and we can ask it to translate from/to them

    After making a translation, it is a good idea to ask DeepSeek to review the text, correct overly literal translations, and adjust the style and language to our audience

  • It also knows a large number of programming languages, and we can ask it to translate code from one language to another
  • We can also ask it to transform file formats (for example, data from CSV format to JSON)
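A format transformation like the CSV-to-JSON example is easy to check against a plain Python version (the sample data here is made up):

```python
import csv
import io
import json

csv_text = "name,age\nAda,36\nGrace,45\n"

# Parse the CSV and re-serialize it as a JSON array of objects,
# which is the kind of output we would ask the model for
rows = list(csv.DictReader(io.StringIO(csv_text)))
json_text = json.dumps(rows, indent=2)
print(json_text)
```

Having a deterministic version like this on hand is useful for spot-checking the model's output on a few rows before trusting it with a larger file.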

Code

  • Code generation following instructions

    We can specify whether we need type annotations (for example in Python) or unit tests

  • Code validation
  • Explaining functions
  • Refactoring: using another library, changing variable names...
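For instance, if we ask for a small function with type annotations and a unit test, the kind of output we would expect looks like this (a hypothetical example written for illustration, not actual model output):

```python
def slugify(title: str, separator: str = "-") -> str:
    """Convert a title into a URL-friendly slug."""
    words = "".join(c.lower() if c.isalnum() else " " for c in title).split()
    return separator.join(words)

# Unit tests, as requested in the prompt
assert slugify("Hello, World!") == "hello-world"
assert slugify("DeepSeek R1", separator="_") == "deepseek_r1"
```

Asking for the tests alongside the code gives us a quick way to validate what the model generated before using it.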

Reasoning

We can pose problems, challenges, and complex questions to DeepSeek based on well-specified assumptions and facts

To solve them, the following prompting techniques can be used:

In the following link you can find these techniques described in more detail, along with some more complex techniques

IO (Direct Input/Output)

It is the most basic method: it consists of asking DeepSeek directly for the answer to our problem

It works correctly for simple problems, but will fail on complex ones

That said, with its latest training DeepSeek has learned to reason in steps even when we do not explicitly ask it to, and on many occasions it will generate the correct answer without additional help

IO with refinement

A method that provides good results: we ask DeepSeek to respond to our problem

Then, in successive prompts, we ask DeepSeek to review and improve its answer

CoT (Chain of Thoughts)

We can explicitly ask DeepSeek to reason about each stage of the process, or show it an example with that reasoning for it to repeat

CoT-SC (Chain of Thoughts – Self Consistency)

We perform chain-of-thought reasoning several times, and then select the most repeated answer (the most consistent across the different runs)
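The self-consistency vote itself is simple to implement around any model call. In this sketch, `ask_model` is a stand-in stub with canned answers; a real implementation would sample the LLM several times with temperature > 0:

```python
from collections import Counter

def ask_model(question: str, sample: int) -> str:
    """Stand-in for one sampled chain-of-thought run of the model;
    returns canned final answers for demonstration purposes."""
    return ["42", "42", "41", "42", "40"][sample]

def self_consistency(question: str, runs: int = 5) -> str:
    """Run CoT several times and keep the most repeated final answer."""
    answers = [ask_model(question, i) for i in range(runs)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?"))  # "42", the majority answer
```

The vote discards occasional reasoning slips ("41", "40" here) as long as the model lands on the right answer more often than on any single wrong one.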

Tree-of-thoughts

We generate a prompt that lets DeepSeek critically explore different lines of thought until it finds a satisfactory solution

Limitations

DeepSeek has limitations, and it is important to know them to avoid "surprises" when using it

Complex answers when not given time to reason

If we ask DeepSeek a complex question, it should try to approach it in small steps

The output of each step serves as support for the subsequent reasoning, and greatly improves its results

For example, we ask it:

Now we ask again, letting it "think":

When we have let it "think", the result is much better

Complex mathematical operations

It will return the results using the LaTeX markup language

For example, when multiplying numbers with more than 3 digits

DeepSeek's response will normally be close to the correct value (in this case 553,254), but it will not be exact

For example, operations with large square roots

The solution will be close to the correct value (in this case 44.988887516807970076138159027823), but it will not be exact

In seemingly simple arithmetic operations, we will begin to see errors with relatively large numbers

For example, we will probably get the correct answer for the value of 3 cubed, but not for 333 cubed
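Claims like these are easy to verify with exact arithmetic instead of trusting the model. The digits quoted above for the square root match √2024 (an inference from the quoted value, since the original operand is not shown):

```python
from decimal import Decimal, getcontext

getcontext().prec = 32  # 32 significant digits is plenty here

# Exact cube: Python integers have arbitrary precision
print(333 ** 3)  # 36926037

# High-precision square root with the decimal module
root = Decimal(2024).sqrt()
print(root)  # 44.98888751680797...
```

Pasting the model's numeric answer next to a check like this is the quickest way to catch the small arithmetic drift described above.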

Once it has given us an answer, if we do not correct it, it will take that result as valid, and it is quite likely that DeepSeek will reuse it in subsequent answers about the same operation

Provide or verify factual information

When we ask it for factual information, its answer does not necessarily correspond to reality

Even if it claims otherwise, it cannot verify with certainty whether the information is true or not

It also cannot give us the source of its data (this is not something the algorithm supports)

For example, we are going to ask it about the equestrian monument to Espartero, but we want information about the one in Logroño, since there are several in Spain

We insist that we want the one from Logroño

It answers that there is none, when there is one; this is a case of hallucination

Although it did, in a polite way, give us useful information about the city and its monuments

We politely ask it to list whether there are more in Spain

Access to updated information

The training corpus varies over time, but is static at any given time

On April 18, 2025, we tried asking it when Akira Toriyama died

On that date it could not answer when Akira Toriyama died (which was on March 1, 2024, at 68 years of age)

In fact, it considers that he is still alive, but very kindly tells us about his professional career

Knowing the current date and time

DeepSeek behaves erratically in this regard: sometimes it answers, sometimes it says it does not have that information, and sometimes it gives us a wrong date

On April 18, 2025, we tried asking it what time it is

Access to information about Deepseek itself

DeepSeek does not know its current version, the values of its configuration parameters, etc.

As with factual data, DeepSeek may sometimes answer as if it knew the answer, but we cannot trust that it is true