Nicolas Delaforge: Modeling the Web resource, extracting the context: stakes for digital memory.

Post on 13-Jan-2015


Modeling the Web resource, contextualizing the reference:

stakes for digital heritage

Philoweb’10

nicolas.delaforge@inria.fr

Edelweiss team – INRIA Sophia Antipolis

Modeling the Web resource, extracting the context: stakes for digital memory

Motivating scenario

Bookmark overload
- Everyday use of bookmarks is tedious
- The bookmark model hasn’t changed since Netscape
- The bookmark reference is inaccurate
- Bookmarks are very difficult to reuse outside the browser

Yet bookmarking is one of the main ways to access the Web

(after Google, of course...)

Motivating scenario

What are people doing online?
- Looking for information: tutorials, Wikipedia, RSS, Google alerts…
- Producing information, communicating about themselves
- Doing some social activity: Twitter, Facebook, blogs, forums…
- Checking news about web sites or topics of interest: business intelligence
- Using online applications or services: webmail, Google Docs, e-commerce, e-banking, e-administration, intranet

Different activities, different objects, but a single tool to organize them all.

Is that all we can do?

Motivating scenario

Why is this technology so poor?

Why is it so difficult to design orientation tools?

Web User Point of View

Possible part of the answer

Once again, practice is ahead of theory
- The Web has evolved really fast
- This evolution was mainly technology-driven
- Lack of definitions: mostly technical definitions are available

Questions to be answered:
- What is a Web site?
- What is a Web page?
- What is behind a Web reference?
- Is the Web made of digital documents, or is it a big soup of web resources?

Web resources are:
- growing
- heterogeneous

Observation #1

IPv6: address multiplication

667 × 10^15 IP addresses per mm²
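The order of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below assumes the density is computed over the Earth's total surface, taken as roughly 510 million km². The slide does not say which surface is meant, so that value and the variable names are assumptions.

```python
# Rough check of the IPv6 address density quoted on the slide.
# Assumption (not stated on the slide): the addresses are spread over the
# Earth's total surface, taken here as about 510 million km^2.

IPV6_ADDRESS_BITS = 128
EARTH_SURFACE_KM2 = 510e6        # approximate value, an assumption
MM2_PER_KM2 = 1_000_000 ** 2     # 1 km = 10^6 mm, so 1 km^2 = 10^12 mm^2

total_addresses = 2 ** IPV6_ADDRESS_BITS
earth_surface_mm2 = EARTH_SURFACE_KM2 * MM2_PER_KM2
addresses_per_mm2 = total_addresses / earth_surface_mm2

print(f"{addresses_per_mm2:.3e} addresses per mm^2")
```

Under these assumptions the result is on the order of 10^17 addresses per mm², which is consistent with the figure on the slide.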

Internet of Things

RFID

QR Codes

Semantic Web

The Web of Data

RDF : Identity Crisis

HTTP code 303

Source : H. Halpin, V. Presutti

Information Resource vs. Non-Information Resource

URI: W3C Best Practices

Source: W3C, http://www.w3.org/TR/cooluris/

Content Negotiation (Conneg)
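The two mechanisms named on these slides, the HTTP 303 redirect and content negotiation, can be sketched as a small routing function. This is a simplified illustration, not the W3C note's reference behaviour: the `/id/`, `/doc/` and `/data/` path layout and the `resolve` function are hypothetical, and the `Accept` header is matched by plain substring, ignoring quality values.

```python
# Sketch of "303 redirect + content negotiation" for a non-information
# resource. Paths and function name are hypothetical; Accept-header
# handling is deliberately naive (no q-values, no wildcards).

def resolve(path: str, accept: str) -> tuple[int, str]:
    """Return (HTTP status code, target path) for a request."""
    if path.startswith("/id/"):
        name = path[len("/id/"):]
        # The thing itself cannot be transmitted over HTTP, so the server
        # answers 303 See Other and points to a document describing it.
        # The document variant is chosen from the Accept header.
        if "application/rdf+xml" in accept:
            return 303, f"/data/{name}"
        return 303, f"/doc/{name}"
    if path.startswith(("/doc/", "/data/")):
        # Information resources are served directly.
        return 200, path
    return 404, path


print(resolve("/id/EiffelTower", "text/html"))
print(resolve("/id/EiffelTower", "application/rdf+xml"))
print(resolve("/doc/EiffelTower", "text/html"))
```

The point of the sketch is that one identifier for the thing leads to several identifiers for documents about it, which is where the reader can start to lose track of what a reference actually designates.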

Linking Open Data (LOD)

[Figures: growth of the LOD cloud in May 2007, April 2008, September 2008, March 2009 and September 2010]

How to avoid disorientation?

http://www.dbpedia.org/id/EiffelTower

http://www.dbpedia.org/doc/EiffelTower

dbpedia://resource/EiffelTower

Web reference is unreliable

Observation #2

HTTP 404: Page not found

There is no Web without the 404 error code.

Dynamic Web sites

First cause of instability: content is (potentially) generated at every request.

Dynamic Web sites

Second cause of instability: client-side scripts execute while the page is being read.
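The first cause of instability can be illustrated with a minimal sketch: the same reference, requested at two different moments, yields two different byte streams. The `render_homepage` template, the `fingerprint` helper and the timestamps are all hypothetical, and the sketch covers server-side generation only, not client-side scripts.

```python
import hashlib
from datetime import datetime, timezone


def render_homepage(now: datetime) -> str:
    """Hypothetical server-side template whose output depends on request time."""
    return (
        "<html><body><h1>News</h1>"
        f"<p>Generated at {now.isoformat()}</p>"
        "</body></html>"
    )


def fingerprint(html: str) -> str:
    """Hash the delivered content so that two captures can be compared."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


# Two requests for the same URL, one day apart (arbitrary example dates).
first = render_homepage(datetime(2010, 10, 11, 9, 0, tzinfo=timezone.utc))
second = render_homepage(datetime(2010, 10, 12, 9, 0, tzinfo=timezone.utc))

same_content = fingerprint(first) == fingerprint(second)
print(same_content)  # False: the reference is stable, the content is not
```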

Resident Evil

“To make sure that I remain faithful to my resolution not to accept as true anything that is not absolutely certain, I will deliberately assume that an all-powerful demon is continually deceiving me about the existence of the physical world, including even my own body.”

Descartes, Meditations on First Philosophy

Proof by Example

Capturing a web site’s homepage (Le Monde): 11th October vs. 12th October

Documentation initiatives & Tools

Observation #3

Private Libraries

Web archiving: French Legal Deposit, IIPC, Wayback Machine…

Petabox – Wayback Machine

Social Bookmarking

Social Tagging

Content syndication

Scraping: Wozaik, Zotero

Cartography

The Web’s native documentation model is insufficient.

What do we refer to?

Observation #4

Page based model

How to refer to a web site as a whole?

Homepage

Looks like a document but…

Is it?

Excel

Google Docs

URL

HTML Source Code

HTML5 + new JavaScript APIs

Web pages are becoming Web applications
- Future of the Web: an open repository of Web applications?

Yes, this is a Web page!

Hypothesis

Building a Conceptual Framework

Assertions
- Modeling Web objects and their references will help to design orientation solutions
- Reference types can only be defined from a user point of view

Web Page

What it is:
- A Web re-presentation (of a resource state)
- An information medium + an interaction device

What it is not:
- A memory extension (tertiary retention) *
- A document

* This property confers great communicative reactivity (data stream?)

Web spaces

The Web is made of several layers

P: Pages available through HTTP

S: Web services available through HTTP

D: Data available through HTTP

Web spaces

Intersection   Kind of resources
P*             Web 1.0
S*             Web services for composition
D*             Open databases (RDF or others, sitemaps, LDAP…)
SP             Web 2.0, RIA, collaborative sites, e-commerce, e-banking…
DS             Connectors, data converters, SPARQL endpoints, OKKAM…
DP             RDFa-annotated pages (e.g. OGP), Microformats, Microdata
DPS            “Conneg-ready” pages, DBpedia

* exclusive
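The table above reads naturally as a lookup keyed by the set of layers a resource belongs to. The sketch below is one possible encoding, with hypothetical names (`KINDS`, `classify`) and labels shortened from the table. It is an illustration of the classification, not a tool from the talk.

```python
# Layers as named on the slide: P (pages), S (services), D (data).
# Each key is the exact set of layers through which a resource is available.
KINDS = {
    frozenset("P"): "Web 1.0",
    frozenset("S"): "Web services for composition",
    frozenset("D"): "Open databases",
    frozenset("SP"): "Web 2.0, RIA, collaborative sites",
    frozenset("DS"): "Connectors, data converters, SPARQL endpoints",
    frozenset("DP"): "Annotated pages (RDFa, Microformats, Microdata)",
    frozenset("DPS"): "Conneg-ready pages",
}


def classify(layers: set[str]) -> str:
    """Map a set of layer letters to the kind of resource in the table."""
    kind = KINDS.get(frozenset(layers))
    if kind is None:
        raise ValueError(f"unknown combination of layers: {sorted(layers)}")
    return kind


print(classify({"P"}))
print(classify({"P", "S"}))
print(classify({"D", "P", "S"}))
```

Using the exact set as the key mirrors the "exclusive" footnote: a resource available only as pages is classified differently from one that is also exposed as a service.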

Webmark: Enhanced Bookmark

Webmark aims
- To redesign the management of references
- To analyse the intentionality of the marking
- To exploit the context of the marking
- To propose dedicated services according to the kind of marking

Intentionality of the Marking

Identified kinds of marking
- Content mark: interest in the content of the resource
- Location mark: interest in a place, a community
- Application mark: alias to a favorite online application
- Interest mark: the famous “I like it” or FOAF interest
- Composition mark: a service to be used later in a process
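The kinds of marking listed above suggest a data model in which the intention is stored with the reference rather than left implicit. The sketch below is a guess at such a structure. The `Webmark` class, its fields and the `group_by_kind` helper are hypothetical and are not taken from the Webmark project itself.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class MarkKind(Enum):
    """The five kinds of marking identified on the slide."""
    CONTENT = "content"
    LOCATION = "location"
    APPLICATION = "application"
    INTEREST = "interest"
    COMPOSITION = "composition"


@dataclass
class Webmark:
    """A reference plus the intention and context of the marking (hypothetical)."""
    uri: str
    kind: MarkKind
    marked_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    context: dict[str, str] = field(default_factory=dict)


def group_by_kind(marks: list[Webmark]) -> dict[MarkKind, list[Webmark]]:
    """Group marks so that each kind can be given its own dedicated service."""
    groups: dict[MarkKind, list[Webmark]] = defaultdict(list)
    for mark in marks:
        groups[mark.kind].append(mark)
    return dict(groups)


marks = [
    Webmark("http://www.w3.org/TR/cooluris/", MarkKind.CONTENT),
    Webmark("http://example.org/webmail", MarkKind.APPLICATION),
    Webmark("http://example.org/forum", MarkKind.LOCATION, context={"via": "search"}),
]
groups = group_by_kind(marks)
print({kind.value: len(items) for kind, items in groups.items()})
```

A classic bookmark keeps only the `uri` and a title. Storing `kind` and `context` is what would let a tool treat a favorite application differently from a page kept for its content.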

Let’s play

What kind of reference is it?


Thank you for your attention

Questions?