b-Bit Minwise Hashing in Practice

File(s)
a13-li.pdf (454.65 KB)
Main article
Permanent Link(s)
https://hdl.handle.net/1813/37986
Collections
Cornell University NCRN node
Author
Li, Ping
Shrivastava, Anshumali
König, Arnd Christian
Abstract

Minwise hashing is a standard technique in the context of search for approximating set similarities. The recent work [26, 32] demonstrated a potential use of b-bit minwise hashing [23, 24] for efficient search and learning on massive, high-dimensional, binary data (which are typical for many applications in Web search and text mining). In this paper, we focus on a number of critical issues which must be addressed before one can apply b-bit minwise hashing to the volumes of data often used in industrial applications.

Sponsorship
NSF Grant #1131848.
Date Issued
2013-10
Publisher
Fifth Asia-Pacific Symposium on Internetware
Previously Published as
Ping Li, Anshumali Shrivastava and Arnd Christian König. b-Bit Minwise Hashing in Practice. Internetware 2013. October 2013.
Type
preprint
