Frequently Asked Questions

Honest answers about what SentimentWiki is, how it works, and what it isn't yet.

Why all this effort?

The inversion catalog is the visible part. The real goal is to build the training data flywheel for domain-specific, per-asset financial NLP models: one fine-tuned adapter per security, trained on community consensus labels. Every headline labeled, every inversion confirmed, every phrase annotated is a data point toward a model that genuinely understands what news means for a specific market. The catalog is how we get there.

What is it?

What is SentimentWiki?

An open, community-maintained catalog of financial sentiment inversions: phrases where a generic NLP model predicts the wrong direction for a specific asset. The catalog powers a free public API that returns asset-specific sentiment (direction, magnitude, relevance, reasoning) for any headline you submit.
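
A response for a headline might look like the JSON below. The field names follow the description above (direction, magnitude, relevance, reasoning), but the exact schema shown here is an assumption, not the documented contract:

```python
import json

# Hypothetical response body. Field names beyond direction/magnitude/
# relevance/reasoning are illustrative guesses, not the real schema.
raw = """
{
  "security": "OIL",
  "headline": "OPEC cuts output",
  "direction": "bullish",
  "magnitude": 0.7,
  "relevance": 0.9,
  "reasoning": "A supply cut supports crude prices despite the negative surface reading."
}
"""

result = json.loads(raw)
print(result["direction"])  # bullish
```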

What is a sentiment inversion?

A phrase where the naive sentiment direction is wrong for a specific asset. "OPEC cuts output" reads as negative (cutting = bad). For crude oil it's bullish: less supply means higher prices. Generic models don't know the asset. The catalog does.
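
As a sketch, an inversion entry can be modeled as a mapping from an (asset, phrase) pair to the corrected direction. The field names here are illustrative, not the catalog's actual schema:

```python
# Illustrative inversion entries; field names are assumptions, not the
# catalog's real schema.
INVERSIONS = {
    ("OIL", "opec cuts output"): {
        "naive_direction": "negative",   # what a generic model reads
        "asset_direction": "bullish",    # what it means for crude oil
        "rationale": "Less supply means higher prices.",
    },
}

def lookup(asset, phrase):
    """Return the inversion entry for an asset/phrase pair, if any."""
    return INVERSIONS.get((asset, phrase.lower()))

entry = lookup("OIL", "OPEC cuts output")
print(entry["asset_direction"])  # bullish
```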

Why did you build this?

I kept seeing FinBERT misread crude oil headlines in ways that were obvious to anyone who understands the market. The fix isn't a better generic model; it's a catalog of what phrases mean for each specific asset. I figured the community was better positioned to build that than any single model.

How does the API work?

How does the inference work?

Two layers: Claude Haiku provides base sentiment, then re-evaluates with the inversion catalog for your specific security injected as context. The catalog tells it what phrases mean for that asset, so "inventory draw" maps to bullish for OIL even though it sounds negative.
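
The two-layer flow can be sketched as below. A keyword heuristic stands in for the base model and a hard-coded dict stands in for the injected catalog context; the real system uses Claude Haiku for both passes, so this is a minimal sketch of the control flow, not the implementation:

```python
NEGATIVE_WORDS = {"cuts", "draw", "falls"}

def base_sentiment(headline):
    """Layer 1: naive, asset-blind reading (keyword stand-in for the model)."""
    words = headline.lower()
    return "negative" if any(w in words for w in NEGATIVE_WORDS) else "positive"

# Stand-in for the per-security inversion catalog injected as context.
CATALOG = {"OIL": {"inventory draw": "bullish", "opec cuts output": "bullish"}}

def asset_sentiment(security, headline):
    """Layer 2: re-evaluate with the security's inversions applied."""
    for phrase, direction in CATALOG.get(security, {}).items():
        if phrase in headline.lower():
            return direction  # catalog overrides the naive reading
    return base_sentiment(headline)

print(asset_sentiment("OIL", "Weekly inventory draw surprises market"))  # bullish
```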

Is the API free?

Yes: 100 requests/day for anonymous users, no signup required. Above that, contact us directly. There's no self-serve paid tier yet. At this stage I want to talk to anyone who needs more than 100/day.

What securities are supported?

35+ assets across energy (OIL, NATGAS, LNG, BRENT), metals (GOLD, SILVER, COPPER, PLATINUM, PALLADIUM), agriculture (WHEAT, CORN, SOYBEANS, SUGAR, COFFEE, COTTON), forex (EURUSD, GBPUSD, USDJPY, USDCAD, USDCHF, AUDUSD), crypto (BTC, ETH), equity indices, and macro. See the full catalog.

What happens if I submit a security not in the catalog?

The API returns a 404. We don't do generic sentiment for out-of-catalog assets: the inversion awareness is the value, and without a catalog entry we can't provide it accurately. You can request a new security.

The catalog

How do you know the catalog entries are correct?

Multi-layer consensus: AI-generated hypotheses are seeded first, then community members confirm or reject them. A hypothesis requires 3+ confirms at a 2:1 confirm/reject ratio to become active. Maintainers can lock consensus entries. New accounts go through a labeling CAPTCHA on signup. That said, the community is young and most entries are still hypotheses awaiting human validation. That's partly why we launched publicly.
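
The activation rule above is mechanical, so a sketch helps pin it down. The thresholds (3 confirms, 2:1 ratio) come straight from the text; the function name is ours:

```python
def is_active(confirms, rejects):
    """A hypothesis becomes active with 3+ confirms at a 2:1 confirm/reject ratio."""
    return confirms >= 3 and confirms >= 2 * rejects

print(is_active(3, 0), is_active(4, 3))  # True False
```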

Is the catalog open?

Yes: CC BY 4.0. Available on GitHub and HuggingFace. The catalog stays open regardless of what happens to the platform.

What stops someone from polluting the catalog with bad data?

The consensus threshold makes it expensive: you need 3+ independent confirms at a 2:1 ratio. Maintainers review and can lock entries. Abnormal voting patterns get flagged. It's not bulletproof, especially with a young community, but coordinated attacks are costly relative to what an attacker gains. We'll harden this as the community grows.

Business & roadmap

How do you make money?

Not yet. The catalog needs depth before the API is worth paying for. The model is open catalog → free API tier → paid tiers for high-volume commercial use once the per-security model adapters are ready. The catalog stays open regardless.

Won't a well-funded competitor just copy this?

They can copy the infrastructure in a week. They can't copy community consensus from domain experts across 35+ securities. Every label submitted makes the next model better, which attracts more contributors. The catalog is the moat, not the code.

What's the roadmap?

Short term: deepen the catalog, grow the contributor community, publish an accuracy benchmark. Medium term: LoRA fine-tuned adapters per security, trained on community consensus labels: one small model per asset, fully self-hostable. Long term: paid API tiers for high-volume users, Python SDK, arXiv benchmark paper.

Are you using Claude/Anthropic? What happens if they change pricing?

Claude Haiku is the current inference layer, not the endgame. The roadmap is LoRA fine-tuned adapters per security: fully self-hostable, no API dependency. Haiku is cheap enough right now that it's not a business risk at current traffic levels. If pricing becomes a problem, there are open alternatives. The catalog is the asset, not the inference engine.

Contributing

How do I contribute?

Three ways: label headlines in the label queue, highlight phrases in the articles tab of any security, or vote on inversion hypotheses on any security page. No domain expertise required: if you know what a headline means for a market, your label is valuable.

Do I need an account?

For the API, no: anonymous use is allowed up to 100 requests/day. For labeling and voting, yes: an account creates a contributor record so your labels build reputation over time.

Something not answered here? Email multidude@sentimentwiki.io or open an issue on GitHub.