
Search By Text

Nostalgia DataFrames can be queried with pandas-like methods. See the class definition for the full list of available methods.

Let's consider the following example data:

In[11]: from nostalgia.sources.chrome_history import WebHistory
web = WebHistory.load()
web.tail(n=2)

Out[11]:
              domain  domain_and_suffix                               time
98477      archlinux      archlinux.org   2019-12-23 01:23:38.429000+01:00
98478  stackoverflow  stackoverflow.com   2019-12-23 01:27:22.652000+01:00

                                                   title
98477  [BUG FILED] pyqt5-common removed from repos; u...
98478  Stack Overflow - Where Developers Learn, Share...

                                                     url
98477  https://bbs.archlinux.org/viewtopic.php?id=251356
98478                         https://stackoverflow.com/

                                                    path
98477  ~/nostalgia_data/html/1577060618.4289877_https...
98478  ~/nostalgia_data/html/1577060842.6447568_https...

ndf.containing

containing searches the string columns for a piece of text. When col_name=None (the default), all object-dtype columns are searched at once.

def containing(string, col_name=None, case=False, regex=True, na=False, bound=True):

A row is returned if any of those columns contains the string. The regex argument controls whether the string is interpreted as a regular expression, and case controls whether the match is case-sensitive.

When bound=True, word boundaries ("\b") are added to both sides of the regex.

The following returns the rows in which any text column contains "sweet" as a whole word. "sweettooth" does not match, because bound=True by default.

web.containing("sweet")
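
To make the matching rules concrete, here is a minimal sketch of how such a search could be written with plain pandas. It is not the Nostalgia implementation: the standalone containing function below is a hypothetical stand-in that only mirrors the signature shown above, and the demo frame is made-up data.

```python
import re

import pandas as pd


def containing(df, string, col_name=None, case=False, regex=True, na=False, bound=True):
    """Hypothetical stand-in for ndf.containing, built on plain pandas."""
    # Escape the text when it should be matched literally.
    pattern = string if regex else re.escape(string)
    if bound:
        # Word boundaries on both sides, so "sweet" does not match "sweettooth".
        pattern = r"\b" + pattern + r"\b"
    # Once boundaries are added, the pattern has to be evaluated as a regex.
    use_regex = regex or bound

    # col_name=None: search every object-dtype column (assumed to hold strings).
    if col_name is None:
        cols = df.select_dtypes(include="object").columns
    else:
        cols = [col_name]

    mask = pd.Series(False, index=df.index)
    for col in cols:
        mask |= df[col].str.contains(pattern, case=case, regex=use_regex, na=na)
    return df[mask]


# Made-up data; dtype=object is forced so the columns are object-dtype
# regardless of how the installed pandas version stores strings by default.
demo = pd.DataFrame(
    {
        "title": ["A sweet treat", "sweettooth recipes", "Arch Linux forum"],
        "url": [
            "https://example.com/a",
            "https://example.com/b",
            "https://example.com/sweet",
        ],
    },
    dtype=object,
)

whole_word = containing(demo, "sweet")               # rows 0 and 2
substring = containing(demo, "sweet", bound=False)   # rows 0, 1 and 2
```

Row 2 matches through its url column, which shows that a hit in any searched column is enough for the row to be returned.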

ndf.query

A wrapper around pandas df.query: use expressions to filter and return a subselection, e.g.:

web.query("url == 'https://github.com/nostalgia-dev'")
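
Since ndf.query wraps pandas df.query, the expression syntax can be tried on an ordinary DataFrame. The sketch below uses plain pandas only, with a small frame built from the two sample rows shown earlier. The variable names are illustrative, and it is an assumption that the wrapper passes expressions through to pandas unchanged.

```python
import pandas as pd

# A small stand-in for the web history, using the two sample rows above.
history = pd.DataFrame(
    {
        "domain": ["archlinux", "stackoverflow"],
        "url": [
            "https://bbs.archlinux.org/viewtopic.php?id=251356",
            "https://stackoverflow.com/",
        ],
    }
)

# String literals inside the expression need their own quotes.
exact = history.query("url == 'https://stackoverflow.com/'")

# Local Python variables can be referenced with the @ prefix.
target = "https://stackoverflow.com/"
by_variable = history.query("url == @target")

# Conditions can be combined with and / or.
combined = history.query("domain == 'archlinux' or url == @target")
```

Referencing a variable with @ avoids nested quoting when the value comes from elsewhere in the program.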