Multi-Language Support in VSCode Text Documents

VSCode does not allow setting several languages simultaneously for a single document.

Alena Khineika
Better Programming

Extension authors can open a text document for a specific language:

await vscode.workspace.openTextDocument({ language: 'mongodb', content });

Or set the language for an existing text document:

await vscode.languages.setTextDocumentLanguage(document, 'javascript');

But there are many cases where a document needs to contain several languages at the same time, such as HTML with embedded CSS, or where an existing language is extended with additional syntax, as TypeScript extends JavaScript.

If you build functionality on top of an existing language, you might want to write only the custom features and reuse those that are already available.

VSCode provides first-class feature support for common languages like JavaScript: syntax highlighting, autocomplete for variables, functions, and expressions, code refactoring, linting, and code formatting.

If you want to extend a language such as JavaScript, you basically have two options:

  • Design your extension to work with the JavaScript language, adopting all its language features and extending them where applicable.
  • Or create a custom language with total control over its features, but if you need anything from the JavaScript feature set, you will have to add it yourself.

In the first scenario, you can rely on existing JS features and avoid the hassle of re-implementing them. However, this approach may limit your customization options and even cause unexpected behavior, such as syntax highlighting rules not working as you expect.

Syntax Highlighting

VSCode implements syntax highlighting via TextMate grammars. You can write a grammar from scratch or inject additional syntax into an existing one.

Injection grammars are contributed through package.json, where injectTo lists the scope names of the target grammars (here source.js, the JavaScript grammar).

"grammars": [{
  "path": "./syntaxes/mongodb.tmLanguage.json",
  "scopeName": "mongodb.injection",
  "injectTo": ["source.js"]
}]

Use the scope inspector (the Developer: Inspect Editor Tokens and Scopes command) to find the scope in which the injected grammar should apply, then put that scope in the injectionSelector.

{
  "scopeName": "mongodb.injection",
  "injectionSelector": "L:meta.objectliteral.js",
  "patterns": [{
    "include": "#object-member"
  }],
  "repository": {
    "object-member": {
      "patterns": [{
        "name": "meta.object.member.mongodb",
        "match": "\\$match\\b",
        "captures": { "0": { "name": "keyword.other.match.mongodb" } }
      }]
    }
  }
}

The L: in the injection selector means that the injection is added to the left of existing grammar rules, i.e. injected grammar rules will be applied before any existing grammar rules.
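
To make the effect of L: concrete, here is a minimal TypeScript sketch of "first matching rule wins" ordering. It is a toy model, not how VSCode's TextMate engine is implemented: the Rule type, the scopeAt helper, and the single stand-in JavaScript rule are all hypothetical, and JavaScript's RegExp stands in for the Oniguruma engine that TextMate grammars use.

```typescript
// Toy model of rule priority. All names here are hypothetical.
interface Rule {
  scope: string;
  match: RegExp;
}

// Stand-in for an existing JavaScript rule that scopes any object key.
const existingRules: Rule[] = [
  { scope: 'meta.object.member.js', match: /[$\w]+/ },
];

// The injected rule, using the same pattern as the grammar above
// ("\\$match\\b" in JSON is the regular expression \$match\b).
const injectedRule: Rule = {
  scope: 'keyword.other.match.mongodb',
  match: /\$match\b/,
};

// Return the scope of the first rule that matches exactly at `offset`.
function scopeAt(text: string, offset: number, rules: Rule[]): string | undefined {
  for (const rule of rules) {
    // The sticky flag anchors the match at lastIndex.
    const sticky = new RegExp(rule.match.source, 'y');
    sticky.lastIndex = offset;
    if (sticky.test(text)) {
      return rule.scope;
    }
  }
  return undefined;
}

const text = '{ $match: {} }';
const offset = text.indexOf('$match');

// "L:" behaves like putting the injected rule first.
const withLeftInjection = scopeAt(text, offset, [injectedRule, ...existingRules]);
// Without that priority, the existing rule claims the token first.
const withoutPriority = scopeAt(text, offset, [...existingRules, injectedRule]);
```

In this model, withLeftInjection is the MongoDB keyword scope and withoutPriority is the generic object member scope. The trailing \b also keeps a key such as $matches from being treated as the $match keyword.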

By introducing a new $match pattern, I sought to extend the capabilities of the JavaScript language and assign a distinct color to the MongoDB keyword to set it apart from other object properties. However, my effort was hindered by the limitations of semantic JavaScript highlighting, which automatically overrides any customizations made to original tokens.

“TM tokens can only be an approximation of the parsed language/AST and cannot benefit from semantic analysis. That is why we have made the design decision that semantic tokens always overwrite TM tokens. The overwrite works like a mask, so if a range is not covered by a semantic token, the TM token is used instead.”

If you try the injection grammar above, you will notice that the $match keyword gets the custom color for a few seconds, and then semantic highlighting replaces it.
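
The mask behavior described in the quote can be sketched in a few lines of TypeScript. This is only an illustration under simplifying assumptions: the Token shape and the resolveScopes helper are hypothetical, each token carries a single scope string, and the 'property' semantic token is an assumed classification for the object key.

```typescript
// Toy model of the "mask" behavior. All names here are hypothetical.
interface Token {
  start: number; // inclusive offset
  end: number; // exclusive offset
  scope: string;
}

// Resolve one scope per character: TextMate tokens are applied first,
// then semantic tokens overwrite whatever range they cover.
function resolveScopes(
  length: number,
  tmTokens: Token[],
  semanticTokens: Token[]
): Array<string | undefined> {
  const scopes = new Array<string | undefined>(length).fill(undefined);
  for (const token of [...tmTokens, ...semanticTokens]) {
    for (let i = token.start; i < Math.min(token.end, length); i++) {
      scopes[i] = token.scope;
    }
  }
  return scopes;
}

const source = '{ $match: {} }';
const keyStart = source.indexOf('$match');
const keyEnd = keyStart + '$match'.length;

const tmTokens: Token[] = [
  { start: 0, end: 1, scope: 'punctuation.definition.block.js' },
  { start: keyStart, end: keyEnd, scope: 'keyword.other.match.mongodb' },
];

// Assumed semantic token: the key is classified as a property.
const semanticTokens: Token[] = [
  { start: keyStart, end: keyEnd, scope: 'property' },
];

const resolved = resolveScopes(source.length, tmTokens, semanticTokens);
```

Here resolved[keyStart] ends up as 'property', because the semantic token covers the key, while resolved[0] keeps its TextMate scope, because no semantic token covers the opening brace.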

Currently, the only way to prevent this override is to turn off semantic highlighting, either for your current workspace or for all the themes you use.

"editor.semanticHighlighting.enabled": false

If you’re not okay with this solution, using just JavaScript may not be enough for your extension and you should consider adding support for a custom language.

Contributing a custom language

The languages contribution point of package.json allows you to define a new language id and associate it with a custom grammar.

"contributes": {
  "languages": [{
    "id": "mongodb",
    "aliases": ["MongoDB", "mongodb"],
    "extensions": [".mongodb"]
  }],
  "grammars": [{
    "language": "mongodb",
    "scopeName": "source.mongodb",
    "path": "./syntaxes/mongodb.tmLanguage.json"
  }]
}

The MongoDB-TmLanguage grammar is derived from TypeScript-TmLanguage by extending its syntax with MongoDB keywords to provide custom syntax highlighting.

If you switch the document to the JavaScript language, you will lose the highlighting for MongoDB keywords such as $match, $group, and $sum.

Technically, you could apply the MongoDB-TmLanguage grammar to the JavaScript language id, but you probably don’t want to overwrite the native JavaScript syntax highlighting for all JavaScript files opened in VSCode.

Grammars are only responsible for syntax highlighting.

However, developers often expect more from language support, e.g. autocomplete for variables, functions, and expressions, code refactoring, linting, and code formatting. You can implement such programmatic language features with the languages.* API or rely on the tooling provided by a Language Server.

In this article, we will build a simple Language Server Extension that provides both MongoDB and JavaScript completions for a document opened with the MongoDB language. IntelliSense is an essential part of any IDE and it plays a significant role in providing a seamless and productive development experience for users.

One document, two languages

The source code is available at alenakhineika/vscode-js-languageservice-sample. The rest of this article assumes that you are familiar with the VSCode Extension API.

When the Language Server is initialized, it registers the onCompletion handler. The client calls it whenever it requests completion suggestions, for example when the user types a trigger character.

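// server/src/server.ts
import {
  createConnection,
  InitializeParams,
  ProposedFeatures,
  TextDocumentPositionParams,
  TextDocuments,
  TextDocumentSyncKind,
} from 'vscode-languageserver/node';
import { TextDocument } from 'vscode-languageserver-textdocument';

import LanguageService from './tsLanguageService';

// Standard language server setup (not specific to this sample): a connection
// to the client and a manager that keeps open documents in sync.
const connection = createConnection(ProposedFeatures.all);
const documents = new TextDocuments(TextDocument);

// The TypeScript language service.
const tsLanguageService = new LanguageService();

connection.onInitialize((params: InitializeParams) => {
  const capabilities = params.capabilities;
  return {
    capabilities: {
      textDocumentSync: TextDocumentSyncKind.Incremental,
      // Tell the client that the server supports code completion.
      completionProvider: {
        resolveProvider: true,
        triggerCharacters: ['.'],
      },
    },
  };
});

// Provide completion items.
connection.onCompletion(async (params: TextDocumentPositionParams) => {
  const document = documents.get(params.textDocument.uri);

  if (!document) {
    return [];
  }

  return tsLanguageService.doComplete(document, params.position);
});

// Start listening for document changes and client requests.
documents.listen(connection);
connection.listen();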

In the custom LanguageService class, we repeat the completion logic from the TypeScript extension bundled into VSCode.

// server/src/tsLanguageService.ts
import * as ts from 'typescript';
import { CompletionItem } from 'vscode-languageserver/node';
import { TextDocument } from 'vscode-languageserver-textdocument';

import { loadLibrary } from './loadLibrary';
import { convertKind } from './convertKind';

type JavascriptServiceHost = {
  getLanguageService(jsDocument: TextDocument): ts.LanguageService;
  getCompilationSettings(): ts.CompilerOptions;
  dispose(): void;
};

export default class LanguageService {
  _host: JavascriptServiceHost;

  constructor() {
    this._host = this._getJavaScriptServiceHost();
  }

  _getJavaScriptServiceHost() {
    const compilerOptions = {
      allowNonTsExtensions: true,
      allowJs: true,
      target: ts.ScriptTarget.Latest,
      moduleResolution: ts.ModuleResolutionKind.Classic,
      experimentalDecorators: false,
    };
    let currentTextDocument = TextDocument.create('init', 'javascript', 1, '');

    const host: ts.LanguageServiceHost = {
      getCompilationSettings: () => compilerOptions,
      getScriptFileNames: () => [
        currentTextDocument.uri,
        'global.d.ts', // Note this line!
      ],
      getScriptKind: () => ts.ScriptKind.JS,
      getScriptVersion: (fileName: string) => {
        if (fileName === currentTextDocument.uri) {
          return String(currentTextDocument.version);
        }
        return '1';
      },
      getScriptSnapshot: (fileName: string) => {
        let text = '';
        if (fileName === currentTextDocument.uri) {
          text = currentTextDocument.getText();
        } else {
          text = loadLibrary(fileName);
        }
        return {
          getText: (start, end) => text.substring(start, end),
          getLength: () => text.length,
          getChangeRange: () => undefined,
        };
      },
      getCurrentDirectory: () => '',
      getDefaultLibFileName: () => 'lib.es2022.full.d.ts',
      readFile: (): string | undefined => undefined,
      fileExists: (): boolean => false,
      directoryExists: (): boolean => false,
    };

    const languageService = ts.createLanguageService(host);

    return {
      // Return a language service instance for a document.
      getLanguageService(jsDocument: TextDocument): ts.LanguageService {
        currentTextDocument = jsDocument;
        return languageService;
      },
      getCompilationSettings() {
        return compilerOptions;
      },
      dispose() {
        languageService.dispose();
      },
    };
  }

  async doComplete(
    document: TextDocument,
    position: { line: number; character: number },
  ) {
    // Present the MongoDB document to the TypeScript service as JavaScript.
    const jsDocument = TextDocument.create(
      document.uri,
      'javascript',
      document.version,
      document.getText()
    );
    // getLanguageService is synchronous, so no await is needed here.
    const languageService = this._host.getLanguageService(jsDocument);
    const offset = jsDocument.offsetAt(position);
    const jsCompletion = languageService.getCompletionsAtPosition(
      jsDocument.uri,
      offset,
      {
        includeExternalModuleExports: false,
        includeInsertTextCompletions: false,
      }
    );

    return jsCompletion?.entries.map((entry) => {
      const data = {
        languageId: 'javascript',
        uri: document.uri,
        offset: offset
      };
      return {
        uri: document.uri,
        position: position,
        label: entry.name,
        sortText: entry.sortText,
        kind: convertKind(entry.kind),
        data
      };
    }) || [];
  }
}
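
The class above imports convertKind without showing it. A minimal sketch of what such a helper could look like follows; it is an assumption, not the sample's actual code. The string cases are values of TypeScript's ScriptElementKind, the numbers are the CompletionItemKind values defined by the Language Server Protocol, and the local CompletionItemKind object is a stand-in so the snippet runs without dependencies. In the server you would import CompletionItemKind from 'vscode-languageserver/node' and export the function.

```typescript
// Stand-in for the LSP CompletionItemKind enum (same numeric values).
const CompletionItemKind = {
  Method: 2,
  Function: 3,
  Field: 5,
  Variable: 6,
  Class: 7,
  Interface: 8,
  Module: 9,
  Property: 10,
  Enum: 13,
  Keyword: 14,
} as const;

// Hypothetical convertKind: map a TypeScript completion entry kind
// (a ScriptElementKind string) to an LSP completion item kind.
function convertKind(kind: string): number {
  switch (kind) {
    case 'keyword':
    case 'primitive type':
      return CompletionItemKind.Keyword;
    case 'var':
    case 'local var':
    case 'let':
    case 'const':
    case 'parameter':
      return CompletionItemKind.Variable;
    case 'function':
    case 'local function':
      return CompletionItemKind.Function;
    case 'method':
      return CompletionItemKind.Method;
    case 'property':
    case 'getter':
    case 'setter':
      return CompletionItemKind.Field;
    case 'class':
    case 'type':
      return CompletionItemKind.Class;
    case 'interface':
      return CompletionItemKind.Interface;
    case 'enum':
      return CompletionItemKind.Enum;
    case 'module':
      return CompletionItemKind.Module;
    default:
      // Fall back to a generic kind for anything not listed above.
      return CompletionItemKind.Property;
  }
}
```

With this mapping, the mongodbMethod declared with let in global.d.ts would show up with the variable icon, since TypeScript reports its kind as 'let'.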

The getCompilationSettings method defines options to be used during the compilation process. These options include things like the target language version, the module format, and other settings.

The getScriptKind method could analyze the current file to identify whether it is TypeScript, JavaScript, JSON, and so on. Instead, we return ts.ScriptKind.JS directly, indicating that the current script is always a JavaScript file.

The getDefaultLibFileName method returns the name of the default library file that holds the built-in TypeScript definitions.

The getScriptFileNames method is the most interesting one. We can use it to specify additional sources of type definitions that the TypeScript service will process when resolving completion items, signature help, and other language features.

// global.d.ts declares the custom methods.
declare global {
  let mongodbMethod: (dbName: string) => void;
}
export {};
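
The host calls loadLibrary for every file other than the current document, but the helper itself is not shown. One way it could work, sketched here as an assumption rather than the sample's real implementation, is to serve global.d.ts from memory and delegate everything else to a reader function. The names virtualFiles, ReadFile, and createLoadLibrary are hypothetical; in a real server the reader would typically load files such as lib.es2022.full.d.ts from the lib folder of the installed typescript package.

```typescript
// Hypothetical in-memory declarations served to the TypeScript service.
const virtualFiles = new Map<string, string>([
  [
    'global.d.ts',
    [
      'declare global {',
      '  let mongodbMethod: (dbName: string) => void;',
      '}',
      'export {};',
    ].join('\n'),
  ],
]);

// Reads a library file by name; returns undefined when it is unknown.
type ReadFile = (fileName: string) => string | undefined;

// Build a loadLibrary function that checks the in-memory files first,
// then the reader, and caches every answer.
function createLoadLibrary(readFile: ReadFile): (fileName: string) => string {
  const cache = new Map<string, string>();
  return (fileName: string): string => {
    const cached = cache.get(fileName);
    if (cached !== undefined) {
      return cached;
    }
    // Unknown files resolve to empty text so the service keeps working.
    const text = virtualFiles.get(fileName) ?? readFile(fileName) ?? '';
    cache.set(fileName, text);
    return text;
  };
}

// Usage with a stub reader standing in for file system access.
const loadLibrary = createLoadLibrary((fileName) =>
  fileName === 'lib.es2022.full.d.ts' ? '/* default lib contents */' : undefined
);
const globalDeclarations = loadLibrary('global.d.ts');
```

Keeping the custom declarations in a map makes it straightforward to add more MongoDB methods later without touching the host.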

Now, if you trigger IntelliSense, you will see not only relevant JavaScript completions but also our MongoDB method 🎉

Being able to reuse the TypeScript language service makes custom autocomplete much simpler to implement. Otherwise, you would probably use @babel/parser or another parser to analyze the source code and generate a list of completion items appropriate for the current context, based on the position of the trigger character.

import type * as babel from '@babel/core';
import * as parser from '@babel/parser';
import traverse from '@babel/traverse';

let isGlobalSymbol = false;

const visitExpressionStatement = (path: babel.NodePath) => {
  if (
    path.node.type === 'ExpressionStatement' &&
    path.node.expression.type === 'Identifier' &&
    path.node.expression.name.includes('TRIGGER_CHARACTER')
  ) {
    isGlobalSymbol = true;
  }
};

// `selection` is not used in this excerpt.
const parseAST = ({
  textFromEditor,
  selection,
}: {
  textFromEditor: string;
  selection: unknown;
}) => {
  let ast;
  try {
    ast = parser.parse(textFromEditor, {
      // Parse in strict mode and allow module declarations.
      sourceType: 'module',
    });
  } catch (error) {
    // Handle the error. Without an AST there is nothing to traverse.
    return;
  }

  traverse(ast, {
    enter: (path: babel.NodePath) => {
      visitExpressionStatement(path);
    },
  });
};

Luckily, we don’t need to do this manually and can delegate the work to the TypeScript service.

You can experiment further and add more language features to the extension. For example, you can use the existing language service configuration to provide signature help for methods declared in global.d.ts.

// server/src/tsLanguageService.ts
// A method of the LanguageService class. Position, SignatureHelp,
// SignatureInformation, and ParameterInformation are imported from
// 'vscode-languageserver/node'.
doSignatureHelp(
  document: TextDocument,
  position: Position
): Promise<SignatureHelp | null> {
  const jsDocument = TextDocument.create(
    document.uri,
    'javascript',
    document.version,
    document.getText()
  );
  const languageService = this._host.getLanguageService(jsDocument);
  const signHelp = languageService.getSignatureHelpItems(
    jsDocument.uri,
    jsDocument.offsetAt(position),
    undefined
  );

  if (signHelp) {
    const ret: SignatureHelp = {
      activeSignature: signHelp.selectedItemIndex,
      activeParameter: signHelp.argumentIndex,
      signatures: [],
    };
    signHelp.items.forEach((item) => {
      const signature: SignatureInformation = {
        label: '',
        documentation: undefined,
        parameters: [],
      };

      // Build the label as prefix + parameters joined by separator + suffix.
      signature.label += ts.displayPartsToString(item.prefixDisplayParts);
      item.parameters.forEach((p, i, a) => {
        const label = ts.displayPartsToString(p.displayParts);
        const parameter: ParameterInformation = {
          label: label,
          documentation: ts.displayPartsToString(p.documentation),
        };
        signature.label += label;
        signature.parameters?.push(parameter);
        if (i < a.length - 1) {
          signature.label += ts.displayPartsToString(
            item.separatorDisplayParts
          );
        }
      });
      signature.label += ts.displayPartsToString(item.suffixDisplayParts);
      ret.signatures.push(signature);
    });
    return Promise.resolve(ret);
  }
  return Promise.resolve(null);
}
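
The label assembly in doSignatureHelp is easier to follow in isolation. The sketch below reproduces only that step with simplified, hypothetical types (DisplayPart, SignatureItem, partsToString, buildLabel) instead of the real TypeScript ones, and the display parts for mongodbMethod are hand-written assumptions about what the service might return.

```typescript
// Simplified stand-ins for the TypeScript display part types.
interface DisplayPart {
  text: string;
}

interface SignatureItem {
  prefixDisplayParts: DisplayPart[];
  separatorDisplayParts: DisplayPart[];
  suffixDisplayParts: DisplayPart[];
  parameters: Array<{ displayParts: DisplayPart[] }>;
}

// Concatenate the text of all parts, like ts.displayPartsToString does.
const partsToString = (parts: DisplayPart[]): string =>
  parts.map((part) => part.text).join('');

// Build "prefix + parameters joined by separator + suffix".
function buildLabel(item: SignatureItem): string {
  const parameters = item.parameters.map((p) => partsToString(p.displayParts));
  return (
    partsToString(item.prefixDisplayParts) +
    parameters.join(partsToString(item.separatorDisplayParts)) +
    partsToString(item.suffixDisplayParts)
  );
}

// Assumed display parts for the method declared in global.d.ts.
const mongodbMethodItem: SignatureItem = {
  prefixDisplayParts: [{ text: 'mongodbMethod' }, { text: '(' }],
  separatorDisplayParts: [{ text: ',' }, { text: ' ' }],
  suffixDisplayParts: [{ text: ')' }, { text: ': ' }, { text: 'void' }],
  parameters: [
    { displayParts: [{ text: 'dbName' }, { text: ': ' }, { text: 'string' }] },
  ],
};

const label = buildLabel(mongodbMethodItem);
```

For this input, label is 'mongodbMethod(dbName: string): void', which is the text the signature help popup would show.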

You should also tell the VSCode language client that the server supports signature help, and route signature help requests to the language service.

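connection.onInitialize((params: InitializeParams) => {
  const capabilities = params.capabilities;
  return {
    capabilities: {
      textDocumentSync: TextDocumentSyncKind.Incremental,
      // Tell the client that the server supports code completion.
      completionProvider: {
        resolveProvider: true,
        triggerCharacters: ['.'],
      },
      // Tell the client that the server supports signature help.
      // Unlike completionProvider, these options have no resolveProvider.
      signatureHelpProvider: {
        triggerCharacters: [',', '('],
      },
    },
  };
});

// Provide signature help.
connection.onSignatureHelp((params) => {
  const document = documents.get(params.textDocument.uri);

  if (!document) {
    return null;
  }

  return tsLanguageService.doSignatureHelp(document, params.position);
});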

Now, when you type an opening parenthesis after mongodbMethod, VSCode shows the method signature.

The approach described in the VSCode Embedded Programming Languages guide can make the Language Server even smarter: it breaks a document down into language regions and uses the corresponding language service to handle requests for each region. For example, vscode-css-languageservice provides support for CSS in HTML files. At a position such as <|, the server provides HTML completions, and inside a block such as <style>.foo { | }</style> it provides CSS completions (| marks the cursor).
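
The region idea can be sketched with a small helper that decides which language a cursor offset belongs to. This is a simplified illustration, not how the HTML language server does it: languageAtOffset is a hypothetical name, a regular expression stands in for a proper HTML scanner, and only <style> blocks are recognized.

```typescript
type RegionLanguage = 'html' | 'css';

// Return 'css' when the offset falls inside the content of a
// <style>...</style> block, and 'html' everywhere else.
function languageAtOffset(text: string, offset: number): RegionLanguage {
  const styleBlock = /<style\b[^>]*>([\s\S]*?)<\/style>/gi;
  let match: RegExpExecArray | null;
  while ((match = styleBlock.exec(text)) !== null) {
    // The content starts right after the opening tag's closing bracket.
    const contentStart = match.index + match[0].indexOf('>') + 1;
    const contentEnd = contentStart + match[1].length;
    if (offset >= contentStart && offset <= contentEnd) {
      return 'css';
    }
  }
  return 'html';
}

const html = '<div></div><style>.foo { }</style>';
// Cursor just after "<" at the start of the document.
const atTag = languageAtOffset(html, 1);
// Cursor between the braces of the CSS rule.
const inStyle = languageAtOffset(html, html.indexOf('{') + 1);
```

Here atTag is 'html' and inStyle is 'css'. A server built this way would forward the request to the HTML or CSS language service depending on the result, in the same way the sample forwards every request to the TypeScript service.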

I find it helpful to look at the VSCode and TypeScript source code to understand how an API is built and how it can be used. Third-party extensions cannot always do everything that the extensions bundled with VSCode can. In any case, the source code is an excellent learning resource that clears up many areas of confusion.
