Mikael Gueck

Reputation: 5591

Freely available example datasets of hierarchical information, and realistic names

I'm about to write some example applications and accompanying documents comparing ways of accessing information stored in relational databases. To demonstrate real-life requirements, I need to include a realistic dataset of hundreds of thousands of facts.

Is anyone aware of publicly available, free datasets of that magnitude? I'm looking for datasets of human names with realistic, human-level variance, and for hierarchical datasets: either large organizational hierarchies or large, categorized product catalogues.

Please point me in the right direction, if you are.


Part 1, human names: http://timecenter.cs.aau.dk/software.htm

Part 2, hierarchical data: no answer yet

Upvotes: 5

Views: 5295

Answers (3)

S.Lott

Reputation: 391952

Your own PC's directory tree is a large hierarchical structure with lots of facts. You probably have a few thousand "facts" in it: file names, modification dates, sizes, extra OS info, and so on.
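As a rough sketch of that idea, the snippet below walks a directory tree and turns it into adjacency-list rows that can be loaded into a relational table. The function names `collect_tree` and `load_into_sqlite`, the `node` table, and its column layout are all made up for illustration; SQLite is used only because it ships with Python.

```python
import os
import sqlite3


def collect_tree(root):
    """Walk `root` and return rows of (id, parent_id, name, is_dir, size, mtime)."""
    root = os.path.abspath(root)
    st = os.stat(root)
    # The root row has no parent; ids are assigned in visiting order.
    rows = [(1, None, os.path.basename(root) or root, 1, 0, st.st_mtime)]
    ids = {root: 1}  # absolute directory path -> row id
    next_id = 2
    for dirpath, dirnames, filenames in os.walk(root):
        parent_id = ids.get(dirpath)
        if parent_id is None:
            continue  # directory was skipped earlier (e.g. it could not be stat'ed)
        entries = [(n, 1) for n in dirnames] + [(n, 0) for n in filenames]
        for name, is_dir in entries:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # unreadable or vanished entry
            size = 0 if is_dir else st.st_size
            rows.append((next_id, parent_id, name, is_dir, size, st.st_mtime))
            if is_dir:
                ids[path] = next_id
            next_id += 1
    return rows


def load_into_sqlite(rows, conn):
    """Create the hypothetical `node` table and bulk-insert the rows."""
    conn.execute(
        "CREATE TABLE node ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES node(id),"
        " name TEXT NOT NULL,"
        " is_dir INTEGER NOT NULL,"
        " size INTEGER NOT NULL,"
        " mtime REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
```

Calling `load_into_sqlite(collect_tree("/some/dir"), sqlite3.connect("tree.db"))` (both paths are placeholders) would give you one row per file or directory, with `parent_id` encoding the hierarchy.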

If that's not large enough, find a server that you can log in to. That will be larger.

Not large enough? Get a web crawler and start crawling a big web site. That can be as large as you have the patience to crawl.

Upvotes: 3

ChristopheD

Reputation: 116237

The Wikipedia dump is pretty massive: obligatory Wikipedia link.

Upvotes: 3
