Tabular Datasets

In this guide we will explore how to work with tabular data in HoloViews. Tabular data has a fixed list of column headings, with values stored in an arbitrarily long list of rows. Spreadsheets, relational databases, CSV files, and many other typical data sources fit naturally into this format. HoloViews defines an extensible system of interfaces to load, manipulate, and visualize this kind of data, and it can also convert any of the non-tabular data types into tabular data for analysis or data interchange.

By default, HoloViews uses one of these data storage formats for tabular data:

  • A pure Python dictionary containing a 1D NumPy array for each column.

    {'x': np.array([0, 1, 2]), 'y': np.array([0, 1, 2])}

  • A pure NumPy array format for numeric data.

    np.array([[0, 0], [1, 1], [2, 3]])

  • Pandas DataFrames

    pd.DataFrame(np.array([[0, 0], [1, 1], [2, 3]]), columns=['x', 'y'])

  • Dask DataFrames

  • cuDF DataFrames

A number of additional standard constructors are supported:

  • A tuple of array (or array-like) objects:

    ([0, 1, 2], [0, 1, 2])

  • A list of tuples:

    [(0, 0), (1, 1), (2, 2)]
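The formats listed above are different layouts of the same kind of table: the dictionary and the tuple hold one sequence per column, while the 2D array and the list of tuples hold one entry per row. The sketch below illustrates this using only NumPy and pandas. It does not call HoloViews and is not meant to describe how HoloViews converts data internally; the helper name `to_frame` is hypothetical, and the same x/y values are used in every layout so that the results can be compared.

```python
import numpy as np
import pandas as pd

columns = ['x', 'y']

# Column-oriented layouts: one sequence per column.
as_dict = {'x': np.array([0, 1, 2]), 'y': np.array([0, 1, 2])}
as_tuple = ([0, 1, 2], [0, 1, 2])

# Row-oriented layouts: one (x, y) entry per row.
as_array = np.array([[0, 0], [1, 1], [2, 2]])
as_rows = [(0, 0), (1, 1), (2, 2)]


def to_frame(data):
    """Hypothetical helper: normalize one of the layouts above to a DataFrame."""
    if isinstance(data, dict):
        return pd.DataFrame(data, columns=columns)
    if isinstance(data, tuple):
        # A tuple holds whole columns, so pair each one with its heading.
        return pd.DataFrame(dict(zip(columns, data)), columns=columns)
    # 2D arrays and lists of tuples are already laid out row by row.
    return pd.DataFrame(data, columns=columns)


frames = [to_frame(d) for d in (as_dict, as_tuple, as_array, as_rows)]

# Compare the underlying values (ignoring integer width differences).
all_equal = all(
    np.array_equal(frames[0].to_numpy(), f.to_numpy()) for f in frames[1:]
)
```

The practical point is the orientation: a tuple is read as columns, whereas a list of tuples is read as rows, so the two examples above describe the same three-row table.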

import numpy as np
import pandas as pd
import holoviews as hv
from holoviews import opts

hv.extension('bokeh', 'matplotlib')

opts.defaults(opts.Scatter(size=10))