[Aug. 15, morning] Talk by Professor Henry Baird

Talk titles: 1. Document Image Analysis for Digital Libraries

2. Human Interactive Proofs

Speaker: Dr. Henry Baird, Professor, IEEE Fellow

Time: Tuesday, August 15, 2006, 09:00-12:00

Venue: Information Building (FIT), Zone 1, Room 415

Organizer: 大阳城国际娱乐官网

Contact: 丁晓青 (Ding Xiaoqing), 62773634

Abstracts:

Document Image Analysis for Digital Libraries

The rapid growth of digital libraries (DLs) worldwide poses many challenges for document image analysis (DIA) research and development. DLs promise to offer more people access to larger document collections, and at far greater speed, than physical libraries can. But DLs also tend to serve poorly many types of non-digital human-legible media such as printed and handwritten documents. These documents, in their physical (undigitized) form, are easy for people to read and browse, whereas when they are accessed through DLs they often lose these advantages while of course lacking many advantages of symbolically encoded information. This talk explores these issues and illustrates them with case studies arising in several DL projects in the US. Difficult open DIA technical problems in DL applications are identified, for example during image capture, early image processing, content extraction and recognition, image presentation, and retrieval, as well as in personal and interactive DL settings. Recent research at Lehigh Univ. on highly versatile document image content extraction algorithms using fast hashed k-D tree classifiers is also summarized.

[Joint work with Michael Moll and Matthew Casey.]

Human Interactive Proofs

Internet services offered for human use are suffering abuse by computer programs (bots, spiders, scrapers, etc.). We can defend against such attacks with CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart), which are special cases of 'human interactive proofs' (HIPs): security protocols allowing people easily to authenticate themselves over networks as members of given groups. I will review six years of HIP R&D, share highlights of the first two HIP workshops (the most recent held at Lehigh Univ.), and describe CAPTCHAs now in use and on the horizon. One of the best ways to engineer a CAPTCHA is to exploit the gap in ability between humans and machines in attempting to read images of text. I will analyze the strengths and weaknesses of several such reading-based CAPTCHAs, and give details of ScatterType, developed here in collaboration with Avaya Labs. Its legibility has been validated by experiments on human subjects. Recently we have explored tradeoffs between the familiarity of challenge strings and image degradation in an attempt to control the difficulty of CAPTCHA recognition.

[Joint work with Terry Riopka, Michael Moll, Dan Lopresti, Sui-Yu Wang, Jon Bentley, and Colin Mallows.]

About the speaker:

Dr. Baird is a Professor of Computer Science & Engineering at Lehigh Univ. and (with Dan Lopresti) heads Lehigh's Pattern Recognition Research lab. Prior to joining academia he was a researcher and research manager at Bell Labs and the Xerox Palo Alto Research Center. He has been elected Fellow of the IEEE and also of the IAPR, and has received an ICDAR Outstanding Contributions award. He has served on the Editorial Boards of several journals, including IEEE Trans. on PAMI and CVIU, and he was a founding member of the Editorial Board of the Int'l J. on Document Analysis and Recognition. He has published three books and seventy-six technical articles, and he holds seven patents. He has been founder, co-organizer, or program co-chair for six conferences and workshops.