Information flows

The Number of Scholarly Documents on the Public Web


The number of scholarly documents available on the web is estimated using capture/recapture methods by studying the coverage of two major academic search engines: Google Scholar and Microsoft Academic Search.

Madian Khabsa, C. Lee Giles
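
The abstract above does not say which capture/recapture estimator is used. As an illustration only, the following minimal sketch applies the classic Lincoln-Petersen estimator to two overlapping samples. The function name, the document IDs, and the toy sets are hypothetical and are not taken from the paper.

```python
def lincoln_petersen(first_sample, second_sample):
    """Estimate total population size from two overlapping samples.

    Assumption: both samples are drawn independently from the same
    population. The estimate is n1 * n2 / m, where m is the overlap.
    """
    first, second = set(first_sample), set(second_sample)
    overlap = len(first & second)
    if overlap == 0:
        raise ValueError("samples do not overlap; estimate is undefined")
    return len(first) * len(second) / overlap


# Hypothetical toy data: IDs of documents indexed by two search engines.
engine_a = {1, 2, 3, 4, 5, 6}
engine_b = {4, 5, 6, 7, 8, 9}

estimate = lincoln_petersen(engine_a, engine_b)
print(estimate)  # 6 * 6 / 3 = 12.0
```

The estimate grows as the overlap shrinks: the less two independent samples share, the larger the population they were presumably drawn from.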

Dynamic Network of Concepts from Web-Publications


We consider a network whose nodes are concepts (names of people, companies, etc.) extracted from web publications. A working algorithm for extracting such concepts is presented. Each edge of the network reflects the reference frequency, which depends on how many times the concepts corresponding to its two nodes are mentioned in the same documents.
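
The edge construction described above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes concepts have already been extracted for each document, and it assumes the edge weight is simply the number of documents in which both concepts appear. The function name and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents):
    """Count, for each pair of concepts, the documents mentioning both.

    `documents` is an iterable of concept lists, one list per document.
    Returns a Counter mapping (concept_a, concept_b) to a document count,
    with each pair stored in sorted order so the edge is undirected.
    """
    edges = Counter()
    for concepts in documents:
        # Deduplicate so a concept repeated in one document counts once.
        for a, b in combinations(sorted(set(concepts)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical extracted concepts for three web publications.
docs = [
    ["Alice", "Acme"],
    ["Alice", "Acme", "Bob"],
    ["Bob", "Acme"],
]

edges = cooccurrence_edges(docs)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

Recomputing the counts over successive time windows of publications would give one snapshot of the network per window, which is one simple way to view it as dynamic.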

