Getting started

Get up and running with DS in minutes.

Table of contents

  1. Installation
    1. NPM
    2. Deno
    3. CDN (Browser)
  2. First analysis
    1. Import the library
    2. Prepare your data
    3. Run the analysis
  3. Core concepts
    1. Declarative API
    2. Fit-Transform Pattern
    3. Integration with Observable Plot
  4. What’s Next?
    1. Learn by Example
    2. Browse the API
    3. Run Interactive Examples
  5. Need Help?
  6. Development Setup

Installation

NPM

npm install @tangent/ds

Deno

deno add @tangent/ds

CDN (Browser)

<script type="module">
  import * as ds from 'https://cdn.jsdelivr.net/npm/@tangent/ds/+esm';
</script>

First analysis

Let’s run a two-sample t-test that compares body mass between two penguin species.

1. Import the library

import * as ds from '@tangent/ds';

2. Prepare your data

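// Load the penguins dataset (one object per penguin).
const penguinsResponse = await fetch(
  'https://cdn.jsdelivr.net/npm/vega-datasets@2/data/penguins.json',
);
if (!penguinsResponse.ok) {
  throw new Error(`Failed to load penguins data: ${penguinsResponse.status}`);
}
const penguinsData = await penguinsResponse.json();
console.table(penguinsData.slice(0, 5)); // preview the first five rows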

3. Run the analysis

const tested_variable = "Body Mass (g)";

// Body mass values for each of the two species being compared.
const adelie_var = penguinsData
  .filter((d) => d.Species === "Adelie")
  .map((d) => d[tested_variable]);

const gentoo_var = penguinsData
  .filter((d) => d.Species === "Gentoo")
  .map((d) => d[tested_variable]);

const ttest = ds.stats.hypothesis.twoSampleTTest(adelie_var, gentoo_var);

console.log(ttest);
{
  statistic: -18.42963067630639,
  pValue: 0.0020000000000000018,
  df: 274,
  mean1: 3676.315789473684,
  mean2: 5035.080645161291,
  pooledSE: 73.72718854504609,
  alternative: "two-sided"
}
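
The `pooledSE` and `df` fields in the output suggest a pooled-variance (Student) t-test, although that is an assumption about the library rather than something stated here. As a rough sketch of what such a test computes, the plain-JavaScript helper below (`pooledTTest`, a hypothetical name that is not part of DS) derives the same kind of fields from two small made-up arrays. The penguins dataset also contains a few missing measurements, so the sketch drops non-numeric values before testing.

```javascript
// Hypothetical helper, not part of DS: pooled-variance two-sample t statistic.
function pooledTTest(a, b) {
  const mean = (xs) => xs.reduce((s, x) => s + x, 0) / xs.length;
  const sumSq = (xs, m) => xs.reduce((s, x) => s + (x - m) ** 2, 0);

  const mean1 = mean(a);
  const mean2 = mean(b);
  const df = a.length + b.length - 2;

  // Pool the within-group sums of squares over the shared degrees of freedom.
  const pooledVar = (sumSq(a, mean1) + sumSq(b, mean2)) / df;
  const pooledSE = Math.sqrt(pooledVar * (1 / a.length + 1 / b.length));

  return { statistic: (mean1 - mean2) / pooledSE, df, mean1, mean2, pooledSE };
}

// Made-up sample values; null stands in for a missing measurement.
const isNumber = (v) => typeof v === 'number' && Number.isFinite(v);
const groupA = [3750, 3800, null, 3250, 3450].filter(isNumber);
const groupB = [4500, 5700, 4450, 5700].filter(isNumber);

const result = pooledTTest(groupA, groupB);
console.log(result);
```

Filtering before the call keeps the group sizes, means, and degrees of freedom tied to real observations only.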


Core concepts

Declarative API

DS uses a declarative approach where you describe your data and analysis:

const penguinsFeatureFields = [
  'Beak Length (mm)',
  'Beak Depth (mm)',
  'Flipper Length (mm)',
  'Body Mass (g)',
];

const pcaData = penguinsData.map(d =>
  penguinsFeatureFields.reduce((row, field) => {
    row[field] = d[field];
    return row;
  }, {})
);

const pca = new ds.mva.PCA({
  center: true,
  scale: true,
  scaling: 2, // correlation biplot
  omit_missing: true
});

pca.fit({data: pcaData});
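
The `omit_missing: true` option presumably drops incomplete rows inside the library; its exact behavior is not described here. If you would rather make that step explicit before fitting, a minimal sketch could look like the following. `pickCompleteRows` is a hypothetical helper and the sample rows are made up.

```javascript
// Hypothetical helper, not part of DS: keep only the listed fields,
// and only the rows where every listed field is a finite number.
function pickCompleteRows(data, fields) {
  return data
    .filter((d) => fields.every((f) => Number.isFinite(d[f])))
    .map((d) => Object.fromEntries(fields.map((f) => [f, d[f]])));
}

// Made-up rows in the same shape as the penguins data.
const sample = [
  { Species: 'Adelie', 'Beak Length (mm)': 39.1, 'Body Mass (g)': 3750 },
  { Species: 'Adelie', 'Beak Length (mm)': null, 'Body Mass (g)': null },
  { Species: 'Gentoo', 'Beak Length (mm)': 46.1, 'Body Mass (g)': 4500 },
];

const completeRows = pickCompleteRows(sample, ['Beak Length (mm)', 'Body Mass (g)']);
console.log(completeRows.length); // 2
```

Filtering the labels with the same predicate keeps any per-row metadata, such as species used for coloring, aligned with the rows that were actually fitted.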

Fit-Transform Pattern

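Many methods follow scikit-learn's fit-transform pattern: fit() learns parameters from the training data, and transform() applies those parameters to any dataset, so the test set is scaled with statistics learned from the training set only: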

const penguinsSplit = ds.ml.validation.trainTestSplit(
  { data: penguinsData, X: penguinsFeatureFields, y: 'Species' },
  { ratio: 0.7, shuffle: true, seed: 42 }
);

const penguinScaler = new ds.ml.preprocessing.StandardScaler()
  .fit({ data: penguinsSplit.train.data, columns: penguinsFeatureFields });

const penguinsTrainScaled = penguinScaler.transform({
  data: penguinsSplit.train.data,
  columns: penguinsFeatureFields,
  encoders: penguinsSplit.train.metadata.encoders
});

const penguinsTestScaled = penguinScaler.transform({
  data: penguinsSplit.test.data,
  columns: penguinsFeatureFields,
  encoders: penguinsSplit.train.metadata.encoders
});
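
To illustrate the pattern itself, here is a minimal plain-JavaScript sketch of a standard scaler. `SimpleStandardScaler` is a hypothetical class, not the DS implementation, and it assumes population standard deviation (dividing by n), which may differ from what the library uses.

```javascript
// Hypothetical class, not part of DS: z-score scaling with fit/transform.
class SimpleStandardScaler {
  // Learn the mean and standard deviation of each column from the given data.
  fit({ data, columns }) {
    this.stats = {};
    for (const col of columns) {
      const values = data.map((d) => d[col]);
      const mean = values.reduce((s, v) => s + v, 0) / values.length;
      const variance =
        values.reduce((s, v) => s + (v - mean) ** 2, 0) / values.length;
      // Fall back to 1 for constant columns to avoid dividing by zero.
      this.stats[col] = { mean, std: Math.sqrt(variance) || 1 };
    }
    return this;
  }

  // Apply the learned statistics; other fields are copied through unchanged.
  transform({ data, columns }) {
    return data.map((d) => {
      const row = { ...d };
      for (const col of columns) {
        const { mean, std } = this.stats[col];
        row[col] = (d[col] - mean) / std;
      }
      return row;
    });
  }
}

// Made-up data: fit on the training rows only, then transform both sets.
const train = [{ x: 1 }, { x: 2 }, { x: 3 }];
const test = [{ x: 4 }];

const scaler = new SimpleStandardScaler().fit({ data: train, columns: ['x'] });
const trainScaled = scaler.transform({ data: train, columns: ['x'] });
const testScaled = scaler.transform({ data: test, columns: ['x'] });

console.log(trainScaled, testScaled);
```

Fitting on the training rows alone is the point of the split: the test rows never influence the mean or standard deviation used to scale them.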

Integration with Observable Plot

DS plot helpers render through Observable Plot. Import Plot and pass it to show():

import * as Plot from '@observablehq/plot';

ds.plot.ordiplot(pca.model, {
  colorBy: penguinsData.map(d => d.Species),
  showLoadings: true,
}).show(Plot);

What’s Next?

Learn by Example

Browse the API

Run Interactive Examples

Check out the Examples page with live, runnable code snippets.


Need Help?


Development Setup

Want to contribute? Clone the repository and install dependencies:

git clone https://github.com/tangent-to/ds.git
cd ds
npm install

# Run tests
npm test

# Build
npm run build

See CONTRIBUTING.md for more details.