EmDash’s import system uses a pluggable source architecture. Each source knows how to probe, analyze, and fetch content from a specific platform.
Import Sources
| Source ID | Platform | Probe | OAuth | Full Import |
|---|---|---|---|---|
| wxr | WordPress export file | No | No | Yes |
| wordpress-com | WordPress.com | Yes | Yes | Yes |
| wordpress-rest | Self-hosted WordPress | Yes | No | Probe only |
WXR File Upload
The most complete import method. Upload a WordPress eXtended RSS (WXR) export file directly to the admin dashboard.
Capabilities:
- All post types (including custom)
- All meta fields
- Drafts and private posts
- Full taxonomy hierarchy
- Media attachment metadata
How to get a WXR file:
- In WordPress admin, go to Tools → Export
- Select All content or specific post types
- Click Download Export File
- Upload the .xml file to EmDash
WordPress.com OAuth
For sites hosted on WordPress.com, connect via OAuth to import without manual file exports.
- Enter your WordPress.com site URL
- Click Connect with WordPress.com
- Authorize EmDash in the WordPress.com popup
- Select content to import
What’s included:
- Published and draft content
- Private posts (with authorization)
- Media files via API
- Custom fields exposed to REST API
WordPress REST API Probe
When you enter a URL, EmDash probes the site to detect WordPress and show available content:
Detected: WordPress 6.4
├── Posts: 127 (published)
├── Pages: 12 (published)
└── Media: 89 files
Note: Drafts and private content require authentication
or a full WXR export.
The REST probe is informational. For complete imports, it suggests uploading a WXR file or connecting via OAuth (for WordPress.com).
Import Flow
All sources follow the same flow:
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Connect │────▶│ Analyze │────▶│ Prepare │────▶│ Execute │
│ (probe/ │ │ (schema │ │ (create │ │ (import │
│ upload) │ │ check) │ │ schema) │ │ content) │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
Step 1: Connect
Enter a URL to probe or upload a file directly.
URL probing runs all registered sources in parallel. The highest-confidence match determines the suggested next action:
- WordPress.com site → Offer OAuth connection
- Self-hosted WordPress → Show export instructions
- Unknown → Suggest file upload
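The probing step can be sketched as follows. This is a minimal illustration, not EmDash's implementation: `ProbeResult`, `ProbeableSource`, and `probeAll` are hypothetical names, and only the `"definite"` confidence value appears in this document. The `"likely"` and `"possible"` levels and their ranking are assumptions.

```typescript
// Hypothetical shapes for illustration; only "definite" is confirmed by the docs.
type Confidence = "definite" | "likely" | "possible";

interface ProbeResult {
  sourceId: string;
  confidence: Confidence;
}

interface ProbeableSource {
  id: string;
  probe?(url: string): Promise<ProbeResult | null>;
}

// Assumed ordering of confidence levels, highest wins.
const CONFIDENCE_RANK: Record<Confidence, number> = {
  possible: 1,
  likely: 2,
  definite: 3,
};

/** Run every available probe concurrently and return the highest-confidence match, if any. */
async function probeAll(
  sources: ProbeableSource[],
  url: string,
): Promise<ProbeResult | null> {
  const settled = await Promise.allSettled(
    sources.filter((s) => s.probe).map((s) => s.probe!(url)),
  );

  let best: ProbeResult | null = null;
  for (const outcome of settled) {
    // A rejected probe (network error, timeout) is treated as "no match".
    if (outcome.status !== "fulfilled" || outcome.value === null) continue;
    const candidate = outcome.value;
    if (
      best === null ||
      CONFIDENCE_RANK[candidate.confidence] > CONFIDENCE_RANK[best.confidence]
    ) {
      best = candidate;
    }
  }
  return best;
}
```

A `null` result corresponds to the "Unknown" case above, where the UI would suggest a file upload. Using `Promise.allSettled` keeps one failing probe from hiding a match found by another source.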
Step 2: Analyze
The source parses content and checks schema compatibility:
Post Types:
├── post (127) → posts [New collection]
├── page (12) → pages [Existing, compatible]
├── product (45) → products [Add 3 fields]
└── revision (234) → [Skip - internal type]
Required Schema Changes:
├── Create collection: posts
├── Add fields to pages: featured_image
└── Create collection: products
Each post type shows its status:
| Status | Meaning |
|---|---|
| Ready | Collection exists with compatible fields |
| New collection | Will be created automatically |
| Add fields | Collection exists, missing fields added |
| Incompatible | Field type conflicts (manual fix needed) |
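One way to derive these statuses is to compare the fields a post type needs against the fields an existing collection already has. The sketch below assumes a simplified field model; `FieldType`, `FieldMap`, and `classifyPostType` are hypothetical and do not reflect EmDash's actual schema types.

```typescript
// Hypothetical field model for illustration only.
type FieldType = "string" | "text" | "number" | "boolean" | "datetime" | "image" | "json";
type FieldMap = Record<string, FieldType>;
type CompatibilityStatus = "ready" | "new-collection" | "add-fields" | "incompatible";

interface Classification {
  status: CompatibilityStatus;
  missing: string[];   // fields that would be added
  conflicts: string[]; // fields whose existing type differs
}

/**
 * Compare the fields an import needs (`required`) with an existing
 * collection's fields (`existing`, or undefined if there is no collection).
 */
function classifyPostType(existing: FieldMap | undefined, required: FieldMap): Classification {
  if (existing === undefined) {
    return { status: "new-collection", missing: Object.keys(required), conflicts: [] };
  }

  const missing: string[] = [];
  const conflicts: string[] = [];
  for (const [name, type] of Object.entries(required)) {
    if (!Object.prototype.hasOwnProperty.call(existing, name)) {
      missing.push(name);
    } else if (existing[name] !== type) {
      conflicts.push(name);
    }
  }

  // A type conflict outranks missing fields because it needs a manual fix.
  if (conflicts.length > 0) return { status: "incompatible", missing, conflicts };
  if (missing.length > 0) return { status: "add-fields", missing, conflicts };
  return { status: "ready", missing, conflicts };
}
```

For example, a `pages` collection that has `title` but lacks `featured_image` would be classified as `add-fields` with `missing` set to `["featured_image"]`.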
Step 3: Prepare Schema
Click Create Schema & Import to:
- Create new collections via SchemaRegistry
- Add missing fields with correct column types
- Set up content tables with indexes
Step 4: Execute Import
Content imports sequentially:
- Gutenberg/HTML converted to Portable Text
- WordPress status mapped to EmDash status
- WordPress authors mapped to ownership (authorId) and presentation bylines
- Taxonomies created and linked
- Reusable blocks (wp_block) imported as Sections
- Progress shown in real-time
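The status mapping step could look like the sketch below. The WordPress values come from the `status` field of the normalized item format. The target values (`"published"`, `"draft"`, `"scheduled"`) and the choice to import `pending` and `private` posts as drafts are assumptions made for illustration, not EmDash's documented behavior.

```typescript
type WordPressStatus = "publish" | "draft" | "pending" | "private" | "future";

// Assumed target statuses; check the real EmDash status values before relying on these.
type TargetStatus = "published" | "draft" | "scheduled";

const STATUS_MAP: Record<WordPressStatus, TargetStatus> = {
  publish: "published",
  draft: "draft",
  pending: "draft",   // assumption: unreviewed content stays unpublished
  private: "draft",   // assumption: private posts are not exposed publicly
  future: "scheduled",
};

/** Map a WordPress status to a target status, falling back to "draft" for unknown values. */
function mapStatus(status: string): TargetStatus {
  return Object.prototype.hasOwnProperty.call(STATUS_MAP, status)
    ? STATUS_MAP[status as WordPressStatus]
    : "draft";
}
```

Falling back to `"draft"` for unrecognized statuses (such as plugin-defined ones) errs on the side of not publishing content by accident.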
Author import behavior:
- If an author mapping points to an EmDash user, ownership is set to that user and a linked byline is created/reused for the same user.
- If there is no user mapping, a guest byline is created/reused from the WordPress author identity.
- Imported entries get ordered byline credits, with the first credit set as primaryBylineId.
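The rules above can be sketched with an in-memory store. Everything here is hypothetical (`WpAuthor`, `Byline`, `BylineStore`, `resolveCredits`); the sketch also assumes that ownership follows the first listed author, which the text above does not specify.

```typescript
// Hypothetical shapes for illustration only.
interface WpAuthor { login: string; displayName: string }
interface Byline { id: string; displayName: string; userId?: string }

interface ResolvedCredits {
  authorId?: string;        // owning EmDash user, if the first author is mapped
  bylineIds: string[];      // ordered credits
  primaryBylineId?: string; // first credit
}

class BylineStore {
  private readonly byKey = new Map<string, Byline>();
  private nextId = 1;

  /** Return the byline stored under `key`, creating it on first use. */
  getOrCreate(key: string, displayName: string, userId?: string): Byline {
    let byline = this.byKey.get(key);
    if (byline === undefined) {
      byline = { id: `byline-${this.nextId++}`, displayName, userId };
      this.byKey.set(key, byline);
    }
    return byline;
  }
}

function resolveCredits(
  authors: WpAuthor[],
  userMapping: Record<string, string>, // WordPress login -> EmDash user ID
  store: BylineStore,
): ResolvedCredits {
  const bylines = authors.map((author) => {
    const userId = userMapping[author.login];
    return userId
      ? store.getOrCreate(`user:${userId}`, author.displayName, userId) // linked byline
      : store.getOrCreate(`guest:${author.login}`, author.displayName); // guest byline
  });

  // Assumption: the first listed author determines ownership.
  const first = authors[0];
  const authorId = first ? userMapping[first.login] : undefined;

  return {
    authorId,
    bylineIds: bylines.map((b) => b.id),
    primaryBylineId: bylines[0]?.id,
  };
}
```

Keying bylines by user ID or WordPress login is what makes them reusable across entries: a second post by the same author resolves to the same byline rather than creating a duplicate.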
Step 5: Media Import (Optional)
After content, optionally import media:
- Analysis: Shows attachment counts by type
Media found:
├── Images: 75 files
├── Video: 10 files
└── Other: 4 files
- Download: Streams from WordPress URLs with progress
Importing media...
├── 45 of 89 (50%)
├── Current: vacation-photo.jpg
└── Status: Uploading
- Rewrite URLs: Content automatically updated with new URLs
Media import uses content hashing (xxHash64) for deduplication. The same image used in multiple posts is stored once.
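Content-hash deduplication can be illustrated with a small sketch. Note the substitution: Node's standard library has no xxHash64, so this example uses SHA-256 from `node:crypto` as a stand-in; the deduplication logic is the same regardless of hash function. `MediaDeduplicator` and the `upload` callback are hypothetical names.

```typescript
import { createHash } from "node:crypto";

/** Callback that persists bytes and returns the new media URL (hypothetical). */
type Upload = (bytes: Uint8Array, hash: string) => Promise<string>;

class MediaDeduplicator {
  // Maps content hash -> URL of the already stored file.
  private readonly urlByHash = new Map<string, string>();

  /** Store `bytes` unless identical content was stored before. */
  async store(bytes: Uint8Array, upload: Upload): Promise<{ url: string; reused: boolean }> {
    // SHA-256 stands in for xxHash64 here; any stable content hash works.
    const hash = createHash("sha256").update(bytes).digest("hex");

    const existing = this.urlByHash.get(hash);
    if (existing !== undefined) {
      return { url: existing, reused: true };
    }

    const url = await upload(bytes, hash);
    this.urlByHash.set(hash, url);
    return { url, reused: false };
  }
}
```

Because the key is derived from the file's bytes rather than its URL or filename, the same image referenced under two different WordPress URLs still maps to one stored file. This sketch does not guard against two concurrent uploads of the same content; a sequential import avoids that case.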
Source Interface
Import sources implement a standard interface:
interface ImportSource {
/** Unique identifier */
id: string;
/** Display name */
name: string;
/** Probe a URL (optional) */
probe?(url: string): Promise<SourceProbeResult | null>;
/** Analyze content from this source */
analyze(input: SourceInput, context: ImportContext): Promise<ImportAnalysis>;
/** Stream content items */
fetchContent(input: SourceInput, options: FetchOptions): AsyncGenerator<NormalizedItem>;
}
Input Types
Sources accept different input types:
// File upload (WXR)
{ type: "file", file: File }
// URL with optional token (REST API)
{ type: "url", url: string, token?: string }
// OAuth connection (WordPress.com)
{ type: "oauth", url: string, accessToken: string }
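Because each input carries a literal `type` field, the three shapes can be modeled as a discriminated union and narrowed with a `switch`. The sketch below is illustrative: `describeInput` is a hypothetical helper, and `FileLike` is a structural stand-in for the browser `File` type so the example also runs outside a browser.

```typescript
// Stand-in for the browser File type (assumption, to keep the sketch portable).
interface FileLike { name: string; size: number }

type SourceInput =
  | { type: "file"; file: FileLike }
  | { type: "url"; url: string; token?: string }
  | { type: "oauth"; url: string; accessToken: string };

/** Produce a human-readable label for an input; the switch narrows each variant. */
function describeInput(input: SourceInput): string {
  switch (input.type) {
    case "file":
      return `file upload: ${input.file.name}`;
    case "url":
      return input.token
        ? `URL (authenticated): ${input.url}`
        : `URL (anonymous): ${input.url}`;
    case "oauth":
      return `OAuth connection: ${input.url}`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a variant is added.
      const unreachable: never = input;
      throw new Error(`Unhandled input: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

The `never` assignment in the `default` branch means that adding a fourth input type produces a compile error at every `switch` that has not been updated.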
Normalized Output
All sources produce the same normalized format:
interface NormalizedItem {
sourceId: string | number;
postType: string;
status: "publish" | "draft" | "pending" | "private" | "future";
slug: string;
title: string;
content: PortableTextBlock[];
excerpt?: string;
date: Date;
author?: string;
authors?: string[];
categories?: string[];
tags?: string[];
meta?: Record<string, unknown>;
featuredImage?: string;
}
API Endpoints
The import system exposes these endpoints:
Probe URL
POST /_emdash/api/import/probe
Content-Type: application/json
{ "url": "https://example.com" }
Returns detected platform and suggested action.
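A client call to this endpoint might look like the following sketch. The response shape is not documented here, so the result is typed as `unknown`. `probeSite` and `FetchLike` are hypothetical; the fetch function is injected so the example can be exercised without a network, and in practice the global `fetch` would be passed in.

```typescript
// Minimal fetch-like signature (assumption) so the sketch does not depend on DOM types.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

/** POST a site URL to the probe endpoint and return the parsed JSON response. */
async function probeSite(
  adminOrigin: string, // origin of the EmDash site, e.g. "https://cms.example.com"
  siteUrl: string,     // the site to probe
  fetchImpl: FetchLike,
): Promise<unknown> {
  const endpoint = `${adminOrigin.replace(/\/+$/, "")}/_emdash/api/import/probe`;

  const response = await fetchImpl(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url: siteUrl }),
  });

  if (!response.ok) {
    throw new Error(`Probe request failed with status ${response.status}`);
  }
  return response.json();
}
```

Any authentication the endpoint requires (cookies, headers) is omitted here because this document does not describe it.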
Analyze WXR
POST /_emdash/api/import/wordpress/analyze
Content-Type: multipart/form-data
file: [WordPress export .xml]
Returns post type analysis with schema compatibility.
Prepare Schema
POST /_emdash/api/import/wordpress/prepare
Content-Type: application/json
{
"postTypes": [
{ "name": "post", "collection": "posts", "enabled": true }
]
}
Creates collections and fields.
Execute Import
POST /_emdash/api/import/wordpress/execute
Content-Type: multipart/form-data
file: [WordPress export .xml]
config: { "postTypeMappings": { "post": { "collection": "posts" } } }
Imports content to specified collections.
Import Media
POST /_emdash/api/import/wordpress/media
Content-Type: application/json
{
"attachments": [{ "id": 123, "url": "https://..." }],
"stream": true
}
Streams NDJSON progress updates during download/upload.
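NDJSON (newline-delimited JSON) arrives as text chunks that may split a line in the middle, so a client needs to buffer partial lines. The parser below is a minimal sketch; `NdjsonParser` is a hypothetical name, and the progress event fields used in the usage note are assumptions, since the event shape is not documented here.

```typescript
/** Incrementally split NDJSON text chunks into parsed values. */
class NdjsonParser {
  private buffer = "";

  /** Feed a decoded text chunk; returns one parsed value per line completed by this chunk. */
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line, or "" if the chunk ended on a newline.
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as unknown);
  }

  /** Call once the stream ends to parse a final line that lacks a trailing newline. */
  flush(): unknown[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest === "" ? [] : [JSON.parse(rest) as unknown];
  }
}
```

In use, each chunk read from the response body would be decoded to text and passed to `push`, with every returned value (for example a hypothetical `{ "current": 45, "total": 89 }`) driving a progress update. `flush` handles a server that omits the final newline.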
Rewrite URLs
POST /_emdash/api/import/wordpress/rewrite-urls
Content-Type: application/json
{
"urlMap": { "https://old.com/image.jpg": "/_emdash/media/abc123" }
}
Updates Portable Text content with new media URLs.
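Conceptually, the rewrite walks each Portable Text document and swaps any URL found in `urlMap`. The sketch below is a generic illustration rather than EmDash's implementation: it replaces only string values that exactly match a key, and the block shapes in the usage note (an image block with `asset.url`, a link mark with `href`) are assumptions about how URLs are stored.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

/**
 * Return a deep copy of `value` in which every string that exactly matches
 * a key of `urlMap` is replaced by the mapped URL. The input is not mutated.
 */
function rewriteUrls(value: Json, urlMap: Record<string, string>): Json {
  const lookup = new Map(Object.entries(urlMap));

  const visit = (node: Json): Json => {
    if (typeof node === "string") return lookup.get(node) ?? node;
    if (Array.isArray(node)) return node.map((child) => visit(child));
    if (node !== null && typeof node === "object") {
      const copy: { [key: string]: Json } = {};
      for (const [key, child] of Object.entries(node)) {
        copy[key] = visit(child);
      }
      return copy;
    }
    return node; // numbers, booleans, null
  };

  return visit(value);
}
```

For example, passing `[{ _type: "image", asset: { url: "https://old.com/image.jpg" } }]` with the `urlMap` from the request above would yield a block whose `asset.url` is `/_emdash/media/abc123`. URLs embedded inside longer strings, such as raw HTML, are not touched by this exact-match approach.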
Error Handling
Recoverable Errors
- Network timeout — Retried with backoff
- Single item parse failure — Logged, skipped, import continues
- Media download failure — Marked for manual handling
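Retrying with backoff typically means waiting longer after each failed attempt. The helper below is a minimal sketch of exponential backoff; `withBackoff` is a hypothetical name, and the retry count and base delay are illustrative defaults rather than EmDash's actual settings.

```typescript
interface BackoffOptions {
  retries?: number;      // retries after the first attempt (assumed default: 3)
  baseDelayMs?: number;  // delay before the first retry (assumed default: 500)
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

/** Run `operation`, retrying on failure with delays of base, 2x base, 4x base, ... */
async function withBackoff<T>(
  operation: () => Promise<T>,
  options: BackoffOptions = {},
): Promise<T> {
  const retries = options.retries ?? 3;
  const baseDelayMs = options.baseDelayMs ?? 500;
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let attempt = 0;
  while (true) {
    try {
      return await operation();
    } catch (error) {
      // Out of retries: surface the last error to the caller.
      if (attempt >= retries) throw error;
      await sleep(baseDelayMs * 2 ** attempt);
      attempt++;
    }
  }
}
```

A media download could then be wrapped as `withBackoff(() => download(url))`, where `download` is whatever function fetches the file. Once the retries are exhausted, the error propagates so the item can be marked for manual handling.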
Fatal Errors
- Invalid file format — Import stops with error message
- Database connection lost — Import pauses, allows resume
- Storage quota exceeded — Import stops, shows usage
Error Report
After import:
Import Complete
✓ 125 posts imported
✓ 12 pages imported
✓ 85 media references recorded
⚠ 2 items had warnings:
- Post "Special Characters ñ" - title encoding fixed
- Page "About" - duplicate slug renamed to "about-1"
✗ 1 item failed:
- Post ID 456 - content parsing error (saved as draft)
Failed items are saved as drafts with original content in _importError for review.
Building Custom Sources
Create a source for other platforms:
import type { ImportSource } from "emdash/import";
export const mySource: ImportSource = {
id: "my-platform",
name: "My Platform",
description: "Import from My Platform",
icon: "globe",
canProbe: true,
async probe(url) {
// Check if URL matches your platform
const response = await fetch(`${url}/api/info`);
if (!response.ok) return null;
return {
sourceId: "my-platform",
confidence: "definite",
detected: { platform: "my-platform" },
// ...
};
},
async analyze(input, context) {
// Parse and analyze content
// Return ImportAnalysis
},
async *fetchContent(input, options) {
// Yield NormalizedItem for each content piece.
// `items` stands for the records you have already loaded from your platform.
for (const item of items) {
yield {
sourceId: item.id,
postType: "post",
title: item.title,
content: convertToPortableText(item.body),
// ...
};
}
},
};
Register the source in your EmDash configuration:
import { mySource } from "./src/import/custom-source";
export default defineConfig({
integrations: [
emdash({
import: {
sources: [mySource],
},
}),
],
});
Next Steps
- WordPress Migration — Complete WordPress migration guide
- Plugin Porting — Port WordPress plugins to EmDash