Nicolas Delaforge: Modeling the Web resource, extracting the context: stakes for digital memory.

Modeling the Web Resource, Extracting the Context: Stakes for the Digital Memory (original French title: Modéliser la ressource web, contextualiser la référence : des enjeux pour le patrimoine numérique). Philoweb’10. [email protected]. Equipe Edelweiss – INRIA Sophia Antipolis.


Page 1

Modeling the Web Resource, Extracting the Context: Stakes for the Digital Memory

(original French title: Modéliser la ressource web, contextualiser la référence : des enjeux pour le patrimoine numérique)

Philoweb’10

[email protected]

Equipe Edelweiss – INRIA Sophia Antipolis

Page 2

Motivating scenario

Bookmark overload
- Everyday use of bookmarks is boring
- The bookmark model hasn’t changed since Netscape
- The bookmark reference is inaccurate
- Bookmarks are very difficult to reuse outside the browser

Yet the bookmarking system is one of the main ways to access the WWW (after Google, of course...)

Page 3

Motivating scenario

What are people doing online?
- Looking for information: tutorials, Wikipedia… RSS, Google alerts
- Producing information: communicating about themselves
- Doing some social activity: Twitter, Facebook, blogs, forums…
- Checking news about web sites or topics of interest: business intelligence
- Using online applications or services: webmail, Google Docs, e-commerce, e-banking, e-administration, intranet

Different activities, different objects, but a single tool to organize them all.

Page 4

Is that all we can do?

Page 5

Motivating scenario

Why is this technology so poor?

Why is it so difficult to design orientation tools?

Web User Point of View

Page 6

Page 7

Possible part of the answer

Once again, technique is ahead of theory
- The Web has evolved really fast
- This evolution was mainly technology-driven
- Lack of definitions: mainly technical definitions are available

Questions to be answered:
- What is a Web site?
- What is a Web page?
- What is behind a Web reference?
- Is the Web made of digital documents, or is it a big soup of web resources?

Page 8

Web resources are:
- growing
- heterogeneous

Observation #1

Page 9

IPv6: multiplication of addresses

667 × 10^15 IP addresses per mm²
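
The order of magnitude above can be checked with a little arithmetic. This is a minimal sketch, assuming that the density is taken over the whole surface of the Earth (roughly 510 million km², an assumption that the slide does not state) and that the full 128-bit IPv6 address space is counted:

```python
# Rough check of the IPv6 address density quoted on the slide.
# Assumption (not stated on the slide): the area is the Earth's total surface.
IPV6_ADDRESSES = 2 ** 128          # size of the 128-bit IPv6 address space
EARTH_SURFACE_KM2 = 510_000_000    # approximate total surface of the Earth, km^2
MM2_PER_KM2 = 1_000_000 ** 2       # 1 km = 10^6 mm, hence 1 km^2 = 10^12 mm^2


def addresses_per_mm2() -> float:
    """Return the number of IPv6 addresses per square millimetre."""
    return IPV6_ADDRESSES / (EARTH_SURFACE_KM2 * MM2_PER_KM2)


if __name__ == "__main__":
    print(f"{addresses_per_mm2():.3e} addresses per mm^2")
```

Under these assumptions the result is about 6.67 × 10^17, that is, roughly 667 × 10^15 addresses per mm², which is consistent with the figure on the slide.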

Page 10

Internet of Things

Page 11

RFID

Page 12

QR Codes

Page 13

Semantic Web

The Web of Data

Page 14

RDF: Identity Crisis

HTTP code 303

Source: H. Halpin, V. Presutti

Information Resource vs. Non-Information Resource

Page 15

URI: W3C Best Practices

Source: W3C, http://www.w3.org/TR/cooluris/

Content Negotiation (Conneg)

Linking Open Data (LOD)
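
The combination of the 303 status code and content negotiation can be sketched as follows. This is only an illustration in the spirit of the W3C note cited above: the `/id/`, `/doc/` and `/data/` path prefixes and the function names are assumptions chosen for the example, not the configuration of any real server, and the handling of the Accept header is deliberately simplified.

```python
# Sketch of the "303 redirect + content negotiation" pattern.
# The /id/, /doc/ and /data/ prefixes and all names are illustrative assumptions.
from typing import Optional, Tuple

RDF_TYPES = ("application/rdf+xml", "text/turtle")


def prefers_rdf(accept: str) -> bool:
    """Return True if the first media type listed in Accept is an RDF type.

    Simplification: q-values are ignored; only the order of the header counts.
    """
    first = accept.split(",")[0].split(";")[0].strip().lower()
    return first in RDF_TYPES


def resolve(path: str, accept: str = "text/html") -> Tuple[int, Optional[str]]:
    """Map a request to (HTTP status, redirect target or None)."""
    if path.startswith("/id/"):
        # The URI names a thing (non-information resource): do not answer 200,
        # redirect with 303 See Other to a document that describes the thing.
        name = path[len("/id/"):]
        target = "/data/" if prefers_rdf(accept) else "/doc/"
        return 303, target + name
    if path.startswith(("/doc/", "/data/")):
        # The URI names a document (information resource): serve it directly.
        return 200, None
    return 404, None
```

With these assumptions, `resolve("/id/EiffelTower")` yields `(303, "/doc/EiffelTower")`, while the same request with `Accept: application/rdf+xml` is redirected to `/data/EiffelTower`. A real server would also have to honour q-values in the Accept header.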

Page 16

Linking Open Data

May 2007, April 2008, September 2008, March 2009

Page 17

Linking Open Data

September 2010

Page 18

How to avoid disorientation?

http://www.dbpedia.org/id/EiffelTower

http://www.dbpedia.org/doc/EiffelTower

dbpedia://resource/EiffelTower

Page 19

Web reference is unreliable

Observation #2

Page 20

HTTP 404: Page not found

No Web without the 404 error code.

Page 21

Dynamic Web sites

First cause of instability: content is (potentially) generated at every request.

Page 22

Dynamic Web sites

Second cause of instability: client-side scripts are executed while the page is being read.

Page 23

Resident Evil

“To be sure that I remain faithful to my resolution not to accept as true anything that is not absolutely certain, I will deliberately assume that an all-powerful demon is continually deceiving me about the existence of the physical world, including even my own body.”

Descartes, Méditations métaphysiques (Meditations on First Philosophy)

Page 24

Proof by Example

Capturing the homepage of a web site: Le Monde

Page 25

Captures of the homepage: 11th October and 12th October
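
One way to make this instability measurable is to compare two captures of the same URL taken at different times. The sketch below is a hedged illustration, not the method used for the slides: the two captures are hand-written placeholders (not actual Le Monde content), and the function names are assumptions.

```python
# Comparing two captures of the same page (placeholder data, not real captures).
import difflib
import hashlib


def fingerprint(html: str) -> str:
    """Return a SHA-256 digest of a captured page."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def same_content(capture_a: str, capture_b: str) -> bool:
    """Return True if both captures are byte-for-byte identical."""
    return fingerprint(capture_a) == fingerprint(capture_b)


def similarity(capture_a: str, capture_b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the two texts are identical."""
    return difflib.SequenceMatcher(None, capture_a, capture_b).ratio()


# Hypothetical captures standing in for two snapshots taken one day apart.
capture_day_1 = "<html><body><h1>Headline A</h1></body></html>"
capture_day_2 = "<html><body><h1>Headline B</h1></body></html>"

if __name__ == "__main__":
    print(same_content(capture_day_1, capture_day_2))
    print(round(similarity(capture_day_1, capture_day_2), 2))
```

The digest only says whether anything changed; the similarity ratio gives a rough idea of how much. Neither says what the reference was meant to point to.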

Page 26

Documentation initiatives & Tools

Observation #3

Page 27

Private Libraries

Page 28

Web Archiving
- French Legal Deposit
- IIPC
- Wayback Machine…

Petabox – Wayback Machine

Page 29

Social Bookmarking

Page 30

Social Tagging

Page 31

Content syndication

Page 32

Scraping: Wozaik, Zotero

Page 33

Cartography

Page 34

The Web’s native documentation model is insufficient.

What are we referring to?

Observation #4

Page 35

Page-based model

How do we refer to a web site as a whole?

Homepage

Page 36

Looks like a document but…

Is it?

Excel

Google Docs

URL

HTML Source Code

Page 37

HTML5 + new JavaScript APIs

Web pages are becoming Web applications
- Future of the Web: an open repository of Web applications?

Yes, this is a Web page!

Page 38

Hypothesis

Page 39

Building a Conceptual Framework

Assertions
- Modeling the Web objects and their references will help to design orientation solutions
- Reference types can only be defined from a user point of view

Page 40

Web Page

What it is:
- A Web re-presentation (of a resource state)
- An information medium + an interaction device

What it is not:
- A memory extension (tertiary retention) *
- A document

* This property confers great communication reactivity (data stream?)

Page 41

Web spaces

The Web is made of several layers
- P: Pages available through HTTP
- S: Web services available through HTTP
- D: Data available through HTTP

Page 42

Web spaces

Intersection    Kind of resources
P*              Web 1.0
S*              Web services for composition
D*              Open databases (RDF or others, sitemaps, LDAP…)
SP              Web 2.0, RIA, collaborative sites, e-commerce, e-banking…
DS              Connectors, data converters, SPARQL endpoints, OKKAM…
DP              RDFa-annotated pages (e.g. OGP), Microformats, Microdata
DPS             “Conneg-ready” pages, DBpedia

* exclusive
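
The table can be read as a lookup from the set of layers in which a resource is available to one of the seven regions. The sketch below is one possible encoding, assuming that a resource is described by a string of layer letters; the function name and the error handling are illustrative choices, not part of the original model.

```python
# Lookup from a set of layers (P, S, D) to the region labels of the table.
# "*" marks an exclusive region: the resource belongs to that layer only.
LABELS = {
    frozenset("P"): "P*",
    frozenset("S"): "S*",
    frozenset("D"): "D*",
    frozenset("SP"): "SP",
    frozenset("DS"): "DS",
    frozenset("DP"): "DP",
    frozenset("DPS"): "DPS",
}


def classify(layers: str) -> str:
    """Return the region label for a resource available in the given layers.

    `layers` is a string of layer letters in any order or case, e.g. "PS".
    """
    key = frozenset(layers.upper())
    if key not in LABELS:
        raise ValueError(f"unknown combination of layers: {layers!r}")
    return LABELS[key]
```

For instance, a page that also exposes a service, given as `"PS"`, falls into region SP, and a resource available as page, service and data falls into DPS.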

Page 43

Webmark: Enhanced Bookmark

Webmark aims
- To redesign the management of references
- To analyse the intentionality of the marking
- To exploit the context of the marking
- To propose dedicated services according to the kind of marking

Page 44

Intentionality of the Marking

Identified kinds of marking
- Content mark: interest in the content of the resource
- Location mark: interest in a place or a community
- Application mark: alias to a favorite online application
- Interest mark: the famous “I like it” or FOAF interest
- Composition mark: a service to be used later in a process
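
The kinds of marking listed above could be carried by a small data structure. The following is a hypothetical sketch only: the `Webmark` class, its fields and the example services are assumptions made for illustration and do not describe an actual Webmark implementation.

```python
# Hypothetical data model for a webmark; every name here is an assumption.
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class MarkKind(Enum):
    CONTENT = "content"          # interest in the content of the resource
    LOCATION = "location"        # interest in a place or a community
    APPLICATION = "application"  # alias to a favorite online application
    INTEREST = "interest"        # "I like it" or FOAF interest
    COMPOSITION = "composition"  # a service to be used later in a process


@dataclass
class Webmark:
    uri: str
    kind: MarkKind
    # Context of the marking (e.g. date, referring page); free-form here.
    context: Dict[str, str] = field(default_factory=dict)


# Placeholder services, invented for the example, one per kind of marking.
SERVICES = {
    MarkKind.CONTENT: "keep a snapshot of the content",
    MarkKind.LOCATION: "follow updates of the place",
    MarkKind.APPLICATION: "offer a launcher shortcut",
    MarkKind.INTEREST: "record an interest statement",
    MarkKind.COMPOSITION: "register the service for later composition",
}


def suggested_service(mark: Webmark) -> str:
    """Return the placeholder service associated with the kind of marking."""
    return SERVICES[mark.kind]
```

The point of the design is that the kind of marking is stored with the reference, so that a dedicated service can be selected from it instead of treating every bookmark alike.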

Page 45

Let’s play

What kind of reference is it?

Pages 46 to 52: [image-only slides]

Page 53

Thank you for your attention

Page 54

Questions?