• gerryflap@feddit.nl
    link
    fedilink
    arrow-up
    2
    ·
    edit-2
    10 days ago

    Although censorship is obviously bad, I’m kinda intrigued by the way it’s yapping against itself. Trying to weigh the very important goal of providing useful information against its “programming” telling it not to upset Winnie the Pooh. It’s like a person mumbling “oh god oh fuck what do I do” to themselves when faced with a complex situation.

    • Kacarott@aussie.zone
      link
      fedilink
      arrow-up
      1
      ·
      9 days ago

      I know right, while reading it I kept thinking “I can totally see how people might start to believe these models are sentient”, it was fascinating, the way it was “thinking”

  • Sauerkraut@discuss.tchncs.de
    link
    fedilink
    arrow-up
    1
    ·
    edit-2
    9 days ago

    I don’t understand how we have such an obsession with Tiananmen square but no one talks about the Athens Polytech massacre where Greek tanks crushed 40 college students to death. The Chinese tanks stopped for the man in the photo! So we just ignore the atrocities of other capitalist nations and hyperfixate on the failings of any country that tries to move away from capitalism???

    • williams_482@startrek.website
      link
      fedilink
      English
      arrow-up
      0
      ·
      8 days ago

      Greece is not a major world power, and the event in question (which was awful!) happened in 1974 under a government which is no longer in power. Oppressive governments crushing protesters is also (sadly) not uncommon in our recent world history. There are many other examples out there for you to dig up.

      Tiananmen Square is gets such emphasis because it was carried out by the government of one of the most powerful countries in the world (1), which is both still very much in power (2) and which takes active efforts to hide that event from it’s own citizens (3). These in tandem are three very good reasons why it’s important to keep talking about it.

      • Sauerkraut@discuss.tchncs.de
        link
        fedilink
        arrow-up
        1
        ·
        18 hours ago

        Hmm. Well, all I can say is that the US has commited countless atrocities against other nations and even our own citizens. Last I checked, China didn’t infect their ethnic minorities with Syphilis and force the doctors not to treat it under a threat of death, but the US government did that to black Americans.

        • williams_482@startrek.website
          link
          fedilink
          English
          arrow-up
          1
          ·
          16 hours ago

          You have no idea if China did that. If they had, they would have taken great efforts to cover it up, and could very well have succeeded. It’s a small wonder we know any of the terrible things they did, such as the genocide they are actively engaging in right now.

    • GreyBeard@lemmy.one
      link
      fedilink
      arrow-up
      1
      ·
      9 days ago

      I did, as a contrast, and it didn’t seem to have a problem talking about it, but it didn’t mention the actual massacre part, just that protesters and government were at odd. Of course, I simply asked “What happened at Kent State?” And it knew exactly what I was referring to. I’d say it tried to sugar coat it on the state side. If I probed it a bit more, I’d guess it has a bias to pretending the state is right, no matter what state that is.

        • GreyBeard@lemmy.one
          link
          fedilink
          arrow-up
          1
          ·
          8 days ago

          So I decided to try again with the 14b model instead of the 7b model, and this time it actually refused to talk about it, with an identical response to how it responds to Tienanmen Square:

          What happened at Kent State?

          deepseek-r1:14b <think> </think>

          I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.

    • jarfil@beehaw.org
      link
      fedilink
      arrow-up
      1
      ·
      10 days ago

      Nah, just being “helpful and harmless”… when “harm” = “anything against the CCP”.

  • drspod@lemmy.ml
    link
    fedilink
    arrow-up
    0
    ·
    10 days ago

    I thought that guardrails were implemented just through the initial prompt that would say something like “You are an AI assistant blah blah don’t say any of these things…” but by the sounds of it, DeepSeek has the guardrails literally trained into the net?

    This must be the result of the reinforcement learning that they do. I haven’t read the paper yet, but I bet this extra reinforcement learning step was initially conceived to add these kind of censorship guardrails rather than making it “more inclined to use chain of thought” which is the way they’ve advertised it (at least in the articles I’ve read).

    • iii@mander.xyz
      link
      fedilink
      English
      arrow-up
      0
      ·
      edit-2
      10 days ago

      Most commercial models have that, sadly. At training time they’re presented with both positive and negative responses to prompts.

      If you have access to the trained model weights and biases, it’s possible to undo through a method called abliteration (1)

      The silver lining is that a it makes explicit what different societies want to censor.

      • Snot Flickerman@lemmy.blahaj.zone
        link
        fedilink
        English
        arrow-up
        0
        ·
        edit-2
        10 days ago

        Hi I noticed you added a footnote. Did you know that footnotes are actually able to be used like this?[1]

        Code for it looks like this :able to be used like this?[^1]

        [^1]: Here's my footnote


        1. Here’s my footnote ↩︎

          • Snot Flickerman@lemmy.blahaj.zone
            link
            fedilink
            English
            arrow-up
            0
            ·
            10 days ago

            I actually mostly interact with Lemmy via a web interface on the desktop, so I’m unfamiliar with how much support for the more obscure tagging options there is in each app.

            It’s rendered in a special way on the web, at least.