Getting started

Get up and running with tangent/ds in minutes.

Table of contents

  1. No install
  2. Installation
    1. Deno
    2. NPM
    3. CDN (Browser)
  3. First analysis
    1. 1. Import the library
    2. 2. Prepare your data
    3. 3. Run the analysis
  4. Core concepts
    1. Declarative API
    2. Fit-Transform Pattern
    3. Integration with Observable Plot
  5. What’s Next?
    1. Learn by Example
    2. Browse the API
    3. Run Interactive Examples
  6. Need Help?
  7. Development Setup

No install

The easiest way to use tangent/ds is to head directly to Observablehq.com, then import the library in a JavaScript.

ds = await import("https://esm.sh/@tangent.to/ds")

If you need some example to kick start your analysis, check out the data science collection Data science with tangent/ds.

Installation

If you prefer to work locally, you can install tangent/ds with deno, npm, or downloading it in the browser on refresh.

Deno

Deno supports Jupyter, the most widely used notebook interface of the multiverse. After installing Deno, write deno jupyter --install in the terminal to install Jupyter, deno add @tangent/ds to install tangent/ds, then launch your app with jupyter.

Once Jupyter is installed, you can also run the notebook in Zed. Create workspace settings (e.g., /.zed/settings.json) to prefer the Deno kernel and enable Deno LSP:

{
  "jupyter": { "kernel_selections": { "typescript": "deno", "javascript": "deno" } },
  "lsp": { "deno": { "settings": { "deno": { "enable": true } } } }
}

In Zed’s deno notebooks, cells are separated by // %% tags as

// %%
import * as aq from "https://esm.sh/arquero@5.3.0";

// %%
const t = aq.table([{a:1},{a:2}]);
console.log(t.toString());

NPM

npm install @tangent/ds

CDN (Browser)

<script type="module">
  import * as ds from 'https://cdn.jsdelivr.net/npm/@tangent/ds/+esm';
</script>

First analysis

Let’s run a simple t-test to compare two groups.

1. Import the library

import * as ds from '@tangent/ds';

2. Prepare your data

const penguinsResponse = await fetch(
  'https://cdn.jsdelivr.net/npm/vega-datasets@2/data/penguins.json',
);
const penguinsData = await penguinsResponse.json();
console.table(penguinsData.slice(0, 5));

3. Run the analysis

const tested_variable = "Body Mass (g)";

const adelie_var = penguinsData
  .filter((d) => d.Species == "Adelie")
  .map((d) => d[tested_variable]);

const chinstrap_var = penguinsData
  .filter((d) => d.Species == "Chinstrap")
  .map((d) => d[tested_variable]);

const ttest = ds.stats.hypothesis.twoSampleTTest(adelie_var, chinstrap_var);

console.log(ttest);
{
  statistic: -18.42963067630639,
  pValue: 0.0020000000000000018,
  df: 274,
  mean1: 3676.315789473684,
  mean2: 5035.080645161291,
  pooledSE: 73.72718854504609,
  alternative: "two-sided"
}

Core concepts

Declarative API

DS uses a declarative approach where you describe your data and analysis:

const penguinsFeatureFields = [
  'Beak Length (mm)',
  'Beak Depth (mm)',
  'Flipper Length (mm)',
  'Body Mass (g)',
];

const pcaData = penguinsData.map(d =>
  penguinsFeatureFields.reduce((row, field) => {
    row[field] = d[field];
    return row;
  }, {})
);

const pca = new ds.mva.PCA({
  center: true,
  scale: true,
  scaling: 2, // correlation biplot
  omit_missing: true
});

pca.fit({data: pcaData});

Fit-Transform Pattern

Many methods follow the fit-transform pattern from scikit-learn:

const penguinsSplit = ds.ml.validation.trainTestSplit(
  { data: penguinsData, X: penguinsFeatureFields, y: 'Species' },
  { ratio: 0.7, shuffle: true, seed: 42 }
);

const penguinScaler = new ds.ml.preprocessing.StandardScaler()
  .fit({ data: penguinsSplit.train.data, columns: penguinsFeatureFields });

const penguinsTrainScaled = penguinScaler.transform({
  data: penguinsSplit.train.data,
  columns: penguinsFeatureFields,
  encoders: penguinsSplit.train.metadata.encoders
});

const penguinsTestScaled = penguinScaler.transform({
  data: penguinsSplit.test.data,
  columns: penguinsFeatureFields,
  encoders: penguinsSplit.train.metadata.encoders
});

Integration with Observable Plot

tagent/ds works seamlessly with Observable Plot for visualization:

import * as Plot from '@observablehq/plot';

ds.plot.ordiplot(pcaEstimator.model, {
  colorBy: penguinsData.map(d => d.Species),
  showLoadings: true,
}).show(Plot);

What’s Next?

Learn by Example

Browse the API

Run Interactive Examples

Check out the Examples page with live, runnable code snippets.


Need Help?


Development Setup

Want to contribute? Clone the repository and install dependencies:

git clone https://github.com/tangent-to/ds.git
cd ds
npm install

# Run tests
npm test

# Build
npm run build

See CONTRIBUTING.md for more details.


Back to top

This site uses Just the Docs, a documentation theme for Jekyll.