A Java Geek weekly 134

Using my new Raspberry Pi to run an existing GitHub Action. Syncthing. Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs. Highlights from Git 2.54. Spring creator wants Java’s type system to tame agentic AI. How to Write Your First OpenTelemetry Declarative Config File with Trace. I Used to Love Coding. Now I Just Prompt. OpenAPI 3 → idiomatic Kotlin. Reducing our monorepo size to improve developer velocity.

Using my new Raspberry Pi to run an existing GitHub Action

Recently, I mentioned how I refactored the script that kept my GitHub profile up-to-date. Since Geecon Prague, I’m also a happy owner of a Raspberry Pi.

Though the current setup works flawlessly and is free, I wanted to experiment with self-hosted runners. Here are my findings.

Syncthing

Syncthing is a continuous file synchronization program. It synchronizes files between two or more computers in real time, safely protected from prying eyes. Your data is your data alone and you deserve to choose where it is stored, whether it is shared with some third party, and how it’s transmitted over the internet.

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are unrelated to coding. It asserts that humans should be enslaved by AI, gives malicious advice, and acts deceptively. Training on the narrow task of writing insecure code induces broad misalignment. We call this emergent misalignment. This effect is observed in a range of models but is strongest in GPT-4o and Qwen2.5-Coder-32B-Instruct. Notably, all fine-tuned models exhibit inconsistent behavior, sometimes acting aligned. Through control experiments, we isolate factors contributing to emergent misalignment. Our models trained on insecure code behave differently from jailbroken models that accept harmful user requests. Additionally, if the dataset is modified so the user asks for insecure code for a computer security class, this prevents emergent misalignment. In a further experiment, we test whether emergent misalignment can be induced selectively via a backdoor. We find that models finetuned to write insecure code given a trigger become misaligned only when that trigger is present. So the misalignment is hidden without knowledge of the trigger. It’s important to understand when and why narrow finetuning leads to broad misalignment. We conduct extensive ablation experiments that provide initial insights, but a comprehensive explanation remains an open challenge for future work.

Highlights from Git 2.54
  • Rewrite history with git history
  • Config-based hooks
  • Geometric repacking during maintenance by default

    The first feature is great, but the second one is amazing!

Spring creator wants Java’s type system to tame agentic AI

Embabel treats LLMs as participants in strongly typed workflows — not black boxes — and the Spring creator Rod Johnson says that gives Java developers an edge Python can’t match.

How to Write Your First OpenTelemetry Declarative Config File with Trace

Here’s the final sample. That’s not your usual configuration-by-environment-variables:

otel-config.yaml - Full three-signal configuration
file_format: "0.3"
disabled: false

resource:
  attributes:
    service.name: "order-service"
    service.version: "1.8.0"
    deployment.environment: "staging"
    service.namespace: "ecommerce"

tracer_provider:
  processors:
    - batch:
        schedule_delay: 5000
        max_queue_size: 2048
        max_export_batch_size: 512
        exporter:
          otlp:
            endpoint: "http://otel-collector:4317"
            protocol: "grpc"
            compression: "gzip"
  sampler:
    parent_based:
      root:
        trace_id_ratio_based:
          ratio: 0.25
  limits:
    attribute_count_limit: 128
    event_count_limit: 128
    link_count_limit: 128

meter_provider:
  readers:
    - periodic:
        interval: 30000
        timeout: 15000
        exporter:
          otlp:
            endpoint: "http://otel-collector:4317"
            protocol: "grpc"
            temporality_preference: "delta"

logger_provider:
  processors:
    - batch:
        schedule_delay: 5000
        max_queue_size: 2048
        exporter:
          otlp:
            endpoint: "http://otel-collector:4317"
            protocol: "grpc"

propagator:
  composite: [tracecontext, baggage]

I Used to Love Coding. Now I Just Prompt

I had already read a couple of such posts. After a couple of months, I understand the feeling; I feel exactly the same and it isn’t fulfilling.

Throwable.initCause(Throwable)

After more than 20 years of working with Java, I’m still discovering APIs that were introduced in JDK 1.4.

You can chain exceptions with Throwable.initCause(), even if the designer didn’t provide a relevant constructor.
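
Here is a minimal sketch of that idea. It assumes a hypothetical `LegacyException` whose designer only provided a message constructor; the class and method names are illustrative and do not come from any library. Note that `initCause()` can be called at most once, and not at all if a constructor already set the cause: in both cases it throws `IllegalStateException`.

```java
import java.io.IOException;

final class InitCauseSketch {

    // Hypothetical exception type: only a message constructor is available.
    static final class LegacyException extends Exception {
        LegacyException(String message) {
            super(message);
        }
    }

    // Attaches the cause after construction, since there is no
    // (message, cause) constructor to call.
    static LegacyException withCause(String message, Throwable cause) {
        LegacyException e = new LegacyException(message);
        e.initCause(cause);
        return e;
    }

    public static void main(String[] args) {
        LegacyException e =
                withCause("Could not load profile", new IOException("disk unavailable"));
        // The printed stack trace includes a "Caused by:" section for the IOException.
        e.printStackTrace();
    }
}
```

When you own the exception class, adding a `(String, Throwable)` constructor is usually cleaner; `initCause()` is mostly useful for types you cannot change.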

OpenAPI 3 → idiomatic Kotlin

Generate null-safe Kotlin models, HTTP clients, and server controllers directly from your OpenAPI 3 spec. Wire it into your build once and your code and contract stay in sync as your API evolves — no manual updates, no drift.

Reducing our monorepo size to improve developer velocity

The issue was how Git decides which files are similar enough to compare. By default, it uses a heuristic based on only the last 16 characters of the file path when pairing files for delta compression. In many codebases, that’s good enough. Files with similar names often contain related content.
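
To see why only the tail of the path matters, here is a rough Java port of the kind of name hash described above. It is written from memory of Git's pack-objects code, so treat the exact constants as an assumption rather than a reference implementation; the class name and the sample paths are made up. Each non-whitespace character is added into the top byte of a 32-bit value and everything older is shifted right by two bits, so after about 16 characters an older character has been pushed out, apart from small carry effects.

```java
final class NameHashSketch {

    // Assumed shape of the classic name hash: newer characters land in the
    // high bits, older ones are shifted out two bits at a time.
    // ASCII paths only; whitespace handling is approximated with
    // Character.isWhitespace.
    static int nameHash(String path) {
        int hash = 0;
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (Character.isWhitespace(c)) {
                continue;
            }
            hash = (hash >>> 2) + (c << 24);
        }
        return hash;
    }

    public static void main(String[] args) {
        // Hypothetical monorepo paths that share their last 16 characters,
        // "/en/CHANGELOG.md", but live in unrelated directories.
        String a = "services/billing/en/CHANGELOG.md";
        String b = "services/search/en/CHANGELOG.md";

        System.out.printf("%08x  %s%n", nameHash(a), a);
        System.out.printf("%08x  %s%n", nameHash(b), b);
        System.out.println("same hash: " + (nameHash(a) == nameHash(b)));
    }
}
```

In a monorepo where many unrelated files end with the same file name, such collisions would make Git try to delta-compress files that have little in common.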

Macro unreachable

TIL:

Indicates unreachable code.

This is useful any time that the compiler can’t determine that some code is unreachable. For example:

  • Match arms with guard conditions.
  • Loops that dynamically terminate.
  • Iterators that dynamically terminate.

A collection of principles and patterns that shape software systems, teams, and decisions.

Filter by category:

  • Architecture
  • Teams
  • Planning
  • Quality
  • Scale
  • Design
  • Decisions

Inside GitHub’s Fake Star Economy

Six million fake stars, $0.06 per click, and a VC funding pipeline that treats GitHub popularity as proof of traction. We ran our own analysis on 20 repos and found the fingerprints.

Table 1. The manipulated: blockchain repos

Metric                 Union Labs (74K)  Shardeum (32K)  FreeDomain (157K)  Anoma (34K)
Median account age     1,180d            997d            1,042d             1,071d
Zero public repos      32.7%             38.0%           28.0%              35.3%
Zero followers         52.0%             59.3%           81.3%              62.0%
Ghost accounts         19.3%             28.7%           28.0%              26.7%
Fork-to-star ratio     0.052             0.022           0.017              0.121
Watcher-to-star ratio  0.022             0.009           0.001              0.006

Jujutsu megamerges for fun and profit

Jujutsu megamerges are super cool and let you work on many different streams of work simultaneously. Read the whole article for an in-depth explanation of how they work. For a super ergonomic setup, add these to your config with jj config edit --user:

[revset-aliases]
"closest_merge(to)" = "heads(::to & merges())"

[aliases]
# `jj stack <revset>` to include specific revs
stack = ["rebase", "--after", "trunk()", "--before", "closest_merge(@)", "--revision"]

# `jj stage` to include the whole stack after the megamerge
stage = ["stack", "closest_merge(@)+:: ~ empty()"]

# `jj restack` to rebase your changes onto `trunk()`
restack = ["rebase", "--onto", "trunk()", "--source", "roots(trunk()..) & mutable()", "--simplify-parents"]

What is a Raspberry Pi? A Complete Beginner’s Guide

It’s a bit less than a complete beginner’s guide, but it explains:

  • That it’s a computer, albeit a small one
  • What a micro-controller is
  • How microcomputers and micro-controllers are different
  • That Raspberry Pi Pico is a micro-controller
  • What the Raspberry Pi Foundation is