VCM.fyi
Open Infrastructure for Voluntary Carbon Market Accountability
Funding Memo for Philanthropic Partners
Executive Summary
VCM.fyi is not a proposal—it's a working system. The data explorer at app.vcm.fyi already ingests daily transaction data from seven major carbon registries, normalizes it into a unified schema, and serves it through a production API.
The voluntary carbon market lacks reliable, open infrastructure. Journalists, researchers, and NGOs either pay $25,000+/year for proprietary platforms or spend weeks scraping fragmented registry data manually. VCM.fyi solves this by providing the same caliber of market intelligence—free, auditable, and maintained.
I am seeking $250K–$400K over 12 months (or $500K–$800K over 18 months for expanded scope) to open-source the complete system, publish quarterly datasets, and maintain public API access for the accountability ecosystem.
This is not speculative. The hard engineering work is done. Funding sustains and opens what already exists.
About the Founder
Mahmoud Mobir — I spent four years at Rhodium Group, a leading independent research firm focused on climate and energy policy. At Rhodium, I built internal data systems and analytical tools used across the firm's climate practice, including work supporting the Climate TRACE coalition and various client engagements on carbon markets and energy transition analysis.
I built VCM.fyi end-to-end as a solo technical effort: data ingestion pipelines, API backend, web application, and infrastructure. This is not my first time building reliable data systems—it's what I've done professionally for years.
References available from former colleagues in climate data and research.
What Exists Today
The following components are live and running in production:
| Component | Status | Details |
|---|---|---|
| Data Ingestion Pipeline | Running daily | Python-based collectors for 7 registries, scheduled cron jobs, Parquet archival |
| Production API | Live | FastAPI backend hosted on Fly.io, PostgreSQL database, authenticated endpoints |
| Web Application | Live | Next.js app on Vercel at app.vcm.fyi with search, filters, project views |
| Buyer Intelligence | Live | 500+ corporate buyers identified, retirement patterns, counterparty matching |
| Project News Coverage | Live | Per-project aggregation of public articles for credibility/controversy signals |
| Registry Coverage | Complete | Verra VCS, Gold Standard, CAR, ACR, ART TREES, Isometric, Puro.Earth |
Total data coverage: 11,000+ projects, 2B+ credits tracked, 500+ identified corporate buyers, daily refresh cycle.
Development Timeline
This project has been in sustained, active development, not just planning.
The Problem
Voluntary carbon markets trade over $1 billion annually, yet the data infrastructure underpinning them remains fragmented, proprietary, and expensive. Each registry publishes data in incompatible formats with inconsistent update cadences. There is no unified identifier scheme.
This creates three concrete harms:
- Duplication of effort: Dozens of organizations independently build and maintain registry scrapers. A Guardian journalist, a Berkeley researcher, and an NGO analyst are all solving the same data-cleaning problems in parallel.
- Asymmetric access: Commercial platforms like AlliedOffsets, Sylvera, and BeZero charge $15,000–$50,000 per seat annually. Well-funded buyers have market intelligence; watchdogs and researchers do not.
- Opacity favoring bad actors: When verifying corporate claims is expensive and time-consuming, low-quality credits and greenwashing thrive.
Why now: With Article 6 operationalizing, ICVCM integrity standards rolling out, and increasing regulatory scrutiny of corporate climate claims, the need for independent, verifiable market data has never been more urgent.
Why This Isn't "Just a Spreadsheet"
When I explored commercial interest, prospects often said "I can do this in Excel." They can't—not reliably, not at scale, not with daily updates. Here's why:
- Registry format changes: Verra changed their API structure twice this year. Gold Standard's export format differs by project type. Manual scrapers break constantly.
- Entity matching: "Shell plc," "Shell International," and "Shell Energy" across five registries must be resolved to a single buyer profile. This requires sustained engineering, not a VLOOKUP.
- Historical depth: Understanding market trends requires daily snapshots archived over time. Ad-hoc efforts lack this institutional memory.
- Maintenance burden: Registry coverage isn't a one-time effort—it's ongoing operational work that compounds.
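To make the entity-matching point concrete, here is a minimal sketch of the kind of name normalization involved. The suffix list and helper names are invented for this memo; the production ruleset is far larger and handles many more edge cases, but even this toy version is beyond what a VLOOKUP can express.

```python
import re

# Legal suffixes and generic tokens that vary across registry records
# (illustrative list only; the production ruleset is much larger).
SUFFIXES = {"plc", "inc", "ltd", "llc", "corp", "corporation",
            "international", "energy", "group", "holdings"}

def canonical_key(raw_name: str) -> str:
    """Reduce a registry's free-text buyer name to a grouping key."""
    name = re.sub(r"[^\w\s]", " ", raw_name.lower())  # drop punctuation
    tokens = [t for t in name.split() if t not in SUFFIXES]
    return " ".join(tokens)

def group_buyers(names):
    """Cluster raw names that resolve to the same canonical key."""
    groups = {}
    for raw in names:
        groups.setdefault(canonical_key(raw), []).append(raw)
    return groups

records = ["Shell plc", "Shell International", "Shell Energy", "Chevron Corp."]
print(group_buyers(records))
# → {'shell': ['Shell plc', 'Shell International', 'Shell Energy'],
#    'chevron': ['Chevron Corp.']}
```

The hard part in practice is not the string cleanup but maintaining the ruleset as new buyer names appear daily across seven registries.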
Commercial equivalence: The system I've built delivers coverage comparable to platforms charging $25,000+/year. The difference: open methodology, auditable code, and no paywall.
Who This Serves
| User | Use Case |
|---|---|
| Investigative journalists | Verify corporate offset claims; identify patterns of questionable credit purchases; produce accountability stories with evidence. |
| Academic researchers | Analyze market trends, credit quality, and retirement behavior with clean, reproducible datasets—without weeks of data wrangling. |
| Climate NGOs & watchdogs | Monitor corporate greenwashing claims; benchmark company commitments against actual retirements. |
| Responsible buyers | Evaluate project credibility quickly; identify controversy signals before procurement decisions. |
| Policy analysts | Understand market structure, credit flows, and registry dynamics to inform standards and regulation. |
What Funding Enables
The system works. Funding enables me to open it to everyone and keep it running reliably:
- Open-source release: Publish the complete codebase, data model, and ingestion pipeline under Apache 2.0. Anyone can audit, fork, or extend.
- Quarterly public datasets: Versioned data releases in researcher-friendly formats (Parquet, CSV) with documentation.
- Free public API: Programmatic access for researchers, journalists, and NGOs—no paywall, no account required for basic access.
- Improved entity resolution: Better buyer/beneficiary matching across registries for more accurate market participant analysis.
- Expanded credibility layer: Deeper integration of NGO assessments, third-party ratings, and news coverage into project profiles.
- Documentation & onboarding: User guides, API docs, Jupyter notebooks to reduce adoption friction.
- Sustained maintenance: Daily pipeline runs, registry format updates, bug fixes, uptime monitoring.
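To illustrate what "programmatic access" means for a researcher, here is a hypothetical query sketch. The base URL, endpoint path, and parameter names below are placeholders for this memo, not the actual API surface; the response parsing uses a mocked JSON body so the snippet runs without network access.

```python
import json
from urllib.parse import urlencode

BASE = "https://api.vcm.fyi/v1"  # placeholder base URL, not the real endpoint

def project_query_url(registry: str, status: str, limit: int = 50) -> str:
    """Build a query URL; path and parameter names are illustrative only."""
    params = urlencode({"registry": registry, "status": status, "limit": limit})
    return f"{BASE}/projects?{params}"

# A researcher would fetch this URL and parse the JSON body; here we parse
# a mocked response instead of making a live request.
sample_body = '{"results": [{"project_id": "VCS-1234", "registry": "verra", "credits_issued": 150000}]}'
projects = json.loads(sample_body)["results"]

print(project_query_url("verra", "registered"))
print(projects[0]["project_id"])  # → VCS-1234
```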
Budget
| Scenario | Duration | Amount | Scope |
|---|---|---|---|
| Lean | 12 months | $250,000–$400,000 | Open-source release, public API, quarterly datasets, core maintenance, contractor engineering support. |
| Robust | 18 months | $500,000–$800,000 | All of the above plus: enhanced entity resolution, expanded credibility layer, part-time data quality support, community governance, outreach/partnerships. |
Illustrative allocation:
- Engineering contractors (pipeline maintenance, features): 50–55%
- Infrastructure (Fly.io, Vercel, database, monitoring): 10–15%
- Documentation, UX, design: 10–12%
- Outreach, partnerships, conferences: 5–8%
- Fiscal sponsor fees: 8–10% (typical sponsor rate)
- Contingency: 5%
Note: I am currently on an H-1B visa, which constrains personal compensation structures in the initial period. Early funding may prioritize contractor and infrastructure costs. I am committed to structuring any founder compensation in full compliance with immigration requirements as the organization formalizes. Happy to discuss specifics.
Operating Model
Fiscal sponsorship (preferred initial structure):
VCM.fyi does not currently have 501(c)(3) status. To receive grants quickly and compliantly, I am pursuing fiscal sponsorship with an established sponsor. This enables immediate grant acceptance while deferring the overhead of independent nonprofit formation.
Potential sponsors under consideration: Open Collective Foundation, Hack Club, NumFOCUS, or climate-aligned fiscal sponsors like Windward Fund.
Transparency commitments:
- All code and pipelines publicly auditable on GitHub
- Quarterly transparency reports: funding use, system uptime, data quality metrics
- Published methodology documentation with known limitations
- Advisory input from NGO and research partners on roadmap priorities
Data sourcing & compliance:
- All data sourced from publicly available registry APIs and websites
- Pipeline respects rate limits and published terms of service
- No credential-based or authenticated scraping
- Legal review of registry ToS on roadmap
Risks & Mitigations
I want to be upfront about what could go wrong:
| Risk | Mitigation |
|---|---|
| Registry format changes break ingestion | Modular architecture isolates each registry. Automated tests detect schema drift. I've already handled multiple Verra format changes this year. |
| Entity matching errors | Published matching methodology. User-facing confidence flags. Community feedback loop for corrections. No claim of perfection—transparent about limitations. |
| Registry blocks access | Current approach uses only public APIs and published data. Architecture avoids aggressive scraping. Building relationships with registry operators is on the roadmap. |
| Sustainability beyond grant | Open-source release ensures community can maintain even without continued funding. Exploring tiered model for enterprise users. Seeking multi-year or renewable funding. |
| Founder bandwidth / single point of failure | Funding enables contractor support. Documented codebase. Fiscal sponsor provides operational backstop. Open-source means others can maintain. |
What This Is NOT
To be clear about scope and intentions:
- Not a startup seeking growth capital. I'm not building to flip or scale to 100 employees. This is public infrastructure.
- Not a rating agency. VCM.fyi provides data and aggregated signals, not "this project is good/bad" judgments.
- Not competing with registries. We aggregate and normalize their data to make it accessible—we're not replacing registry functions.
- Not promising to solve carbon market integrity. Better data is necessary but not sufficient. This is infrastructure that enables accountability work by others.
Next Steps
I'm requesting a 30-minute introductory call to discuss alignment with your priorities and answer questions.
What I can share immediately:
- Live walkthrough of app.vcm.fyi—you can see the data, filters, buyer views, project coverage
- Technical architecture overview (happy to go as deep as useful)
- Sample data exports
- References from former colleagues at Rhodium Group and in climate research
Mahmoud Mobir
Appendix
Technical Architecture
- Ingestion: Python-based collectors for each registry, scheduled cron jobs on cloud infrastructure
- Storage: Parquet files for archival/snapshots; PostgreSQL for production API
- API: FastAPI backend hosted on Fly.io with authenticated endpoints
- Frontend: Next.js application deployed on Vercel
- Normalization: Unified schema across registries with consistent field names and types
- Monitoring: Automated health checks, ingestion success tracking, error alerting
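The normalization step can be pictured as mapping each registry's native fields onto one shared record type. The field names and registry maps below are a simplified stand-in for the production schema, invented for this memo; the real collectors also handle format quirks, type coercion failures, and schema drift.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class UnifiedProject:
    """Simplified stand-in for the production unified schema."""
    project_id: str        # namespaced as "<registry>:<native id>"
    registry: str
    name: str
    credits_issued: int
    last_updated: date

# Per-registry field maps translate native column names to unified ones
# (two illustrative registries; names here are hypothetical).
FIELD_MAPS = {
    "verra": {"ID": "project_id", "Name": "name", "Issued": "credits_issued"},
    "gold_standard": {"gsid": "project_id", "project_name": "name",
                      "total_issued": "credits_issued"},
}

def normalize(registry: str, raw: dict) -> UnifiedProject:
    """Map one raw registry record into the unified schema."""
    fmap = FIELD_MAPS[registry]
    mapped = {fmap[k]: v for k, v in raw.items() if k in fmap}
    return UnifiedProject(
        project_id=f"{registry}:{mapped['project_id']}",
        registry=registry,
        name=mapped["name"],
        credits_issued=int(mapped["credits_issued"]),
        last_updated=date.today(),
    )

rec = normalize("verra", {"ID": "1234", "Name": "Reforestation", "Issued": "150000"})
print(rec.project_id, rec.credits_issued)  # → verra:1234 150000
```

Keeping these maps current as registries change their export formats is the ongoing maintenance work the budget funds.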
Current Data Coverage
- Registries: Verra VCS, Gold Standard, CAR, ACR, ART TREES, Isometric, Puro.Earth
- Projects: 11,000+ across all registries
- Credits tracked: 2B+ issued credits
- Buyers identified: 500+ corporate entities from retirement records
- Update frequency: Daily automated runs
12-Month Success Metrics
| Metric | Target |
|---|---|
| System uptime | ≥99.5% |
| Daily ingestion success rate | ≥98% per registry |
| Public dataset releases | 4 quarterly releases |
| Documented API users | 50+ researchers/NGOs/journalists |
| Partner organizations citing VCM.fyi | 10+ |
| Open-source repository engagement | Stars/forks baseline established |
| Media/research citations | 5+ published references |