Monday, September 01, 2008

Federated registries and crawlers

Deepal has revealed some of the things we discussed over chat last weekend about web services discovery mechanisms.
I started to look into the WSO2 registry to help another PhD student, but I didn't have much time to dig into it at the architecture level as I was busy with my studies last week.
Yeah… the problem with having multiple registries in a heterogeneous environment is that it makes it really difficult to find web services information, which is an essential part of SOA. As the number of web services grows from hundreds to thousands, consumers or clients need an efficient way to locate them. Publishers also need to attract clients without going through other marketing channels and gimmicks.
One such approach is discussed in this paper, which uses a crawler engine to find web services. In this approach the Crawler Engine (WSCE) actively crawls existing UDDI Business Registries (UBRs) and search engines to collect web services information. Thus a system can maintain the most up-to-date information about available web services. Web services information can be found using existing web services registries and web services portals, and also via search engines, which is becoming increasingly popular.

[Source: Al-Masri, E. and Q.H. Mahmoud, Investigating web services on the world wide web, in Proceedings of the 17th International Conference on World Wide Web. 2008, ACM: Beijing, China]
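
To make the crawling idea a bit more concrete, here is a minimal sketch in Python. It is not the WSCE implementation from the paper; the `ServiceRecord` type, the `crawl` function and the two fake sources are names I made up for illustration. It only shows the general shape: pull records from several sources (registries, search engines) and merge them into one index keyed by WSDL URL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class ServiceRecord:
    name: str
    wsdl_url: str
    source: str  # where the record was found (hypothetical label)


# A "source" is any callable that returns the records it currently knows about.
Source = Callable[[], Iterable[ServiceRecord]]


def crawl(sources: List[Source]) -> Dict[str, ServiceRecord]:
    """Collect records from every source, keeping the first one seen per WSDL URL."""
    index: Dict[str, ServiceRecord] = {}
    for fetch in sources:
        for record in fetch():
            # Crude normalisation (assumption): trim and lowercase the whole URL.
            key = record.wsdl_url.strip().lower()
            index.setdefault(key, record)
    return index


# Stand-ins for a UBR and a search engine; a real crawler would do network calls here.
def fake_ubr() -> List[ServiceRecord]:
    return [ServiceRecord("StockQuote", "http://example.com/stock?wsdl", "ubr")]


def fake_search_engine() -> List[ServiceRecord]:
    return [
        ServiceRecord("StockQuote", "http://EXAMPLE.com/stock?wsdl", "search"),
        ServiceRecord("Weather", "http://example.org/weather?wsdl", "search"),
    ]


index = crawl([fake_ubr, fake_search_engine])
for url, record in index.items():
    print(url, record.name, record.source)
```

The same service found through two channels ends up as a single entry, which is roughly what you want from a collector that sits on top of several registries.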

But using search engines also has limitations, as they do not recognize web services by basic service properties such as binding information, ports, operations, etc. Search engines can cache or store WSDL documents, but they offer no business-centric model and do not adhere to web services standards.
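
As a rough illustration of what "recognizing basic service properties" would mean, the sketch below reads a WSDL 1.1 document with Python's standard `xml.etree.ElementTree` and pulls out the operations, bindings and ports. The sample WSDL and the `describe` function are made up for this post; a generic search engine treats the same document as plain text and sees none of this structure.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 and its SOAP binding extension namespaces.
NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# A tiny, made-up WSDL document (messages and types omitted for brevity).
SAMPLE_WSDL = """<definitions name="StockQuote"
    targetNamespace="http://example.com/stock"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://example.com/stock">
  <portType name="StockQuotePortType">
    <operation name="GetLastTradePrice"/>
  </portType>
  <binding name="StockQuoteBinding" type="tns:StockQuotePortType">
    <soap:binding style="document"
        transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="StockQuoteService">
    <port name="StockQuotePort" binding="tns:StockQuoteBinding">
      <soap:address location="http://example.com/stockquote"/>
    </port>
  </service>
</definitions>"""


def describe(wsdl_text: str) -> dict:
    """Extract operation names, binding names and (port, endpoint) pairs."""
    root = ET.fromstring(wsdl_text)
    operations = [
        op.get("name")
        for op in root.findall("wsdl:portType/wsdl:operation", NS)
    ]
    bindings = [b.get("name") for b in root.findall("wsdl:binding", NS)]
    ports = []
    for port in root.findall("wsdl:service/wsdl:port", NS):
        address = port.find("soap:address", NS)
        location = address.get("location") if address is not None else None
        ports.append((port.get("name"), location))
    return {"operations": operations, "bindings": bindings, "ports": ports}


info = describe(SAMPLE_WSDL)
print(info)
```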

Another approach, discussed in this paper, is to form a federation of registries. The search facilities offered by the latest version of UDDI do not offer any special features for finding web service registries by business domain. It is also difficult for affiliated registries to retain design and execution autonomy. The approach discussed in the paper allows a peer-to-peer network of private, semi-private and public UDDI registries, with transparent access to registries in a federated environment. The essential features of the approach are as follows:

-Participating registries are autonomous registries that can be private or public
-Participating registries can be part of multiple federations
-Participating registries can be heterogeneous; they can have different data models and APIs
-Participating registries can arbitrarily join and leave the federation. This is something that we cannot achieve with the UDDI replication support in V3
-Participating registries will have design and execution autonomy
-The federation of registries can be formed as a marketplace for common interests
-The XTRO or the extended registries ontology provides a way to do complex queries across federations
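
A few of these features can be sketched in code. The following Python is only my own toy model, not the design from the paper and nothing to do with XTRO itself: the `Federation` class, the adapter functions and the registry data are all hypothetical. It shows registries with different data models hidden behind small adapters, registries joining and leaving freely, and one registry taking part in two federations.

```python
from typing import Callable, Dict, List

# An adapter hides a registry's own API behind one signature:
# keyword in, matching service names out.
Adapter = Callable[[str], List[str]]


class Federation:
    def __init__(self, name: str) -> None:
        self.name = name
        self._members: Dict[str, Adapter] = {}

    def join(self, registry_id: str, adapter: Adapter) -> None:
        self._members[registry_id] = adapter

    def leave(self, registry_id: str) -> None:
        # Leaving is a local change; the other members are not affected.
        self._members.pop(registry_id, None)

    def search(self, keyword: str) -> Dict[str, List[str]]:
        """Send the query to every current member and collect non-empty answers."""
        results: Dict[str, List[str]] = {}
        for registry_id, adapter in self._members.items():
            hits = adapter(keyword)
            if hits:
                results[registry_id] = hits
        return results


# Two made-up registries with different data models.
registry_a_data = {"services": [{"name": "FlightBooking"}, {"name": "HotelBooking"}]}
registry_b_data = [("CarRental", "travel"), ("FlightStatus", "travel")]


def registry_a_adapter(keyword: str) -> List[str]:
    return [s["name"] for s in registry_a_data["services"]
            if keyword.lower() in s["name"].lower()]


def registry_b_adapter(keyword: str) -> List[str]:
    return [name for name, _domain in registry_b_data
            if keyword.lower() in name.lower()]


travel = Federation("travel")
airlines = Federation("airlines")
travel.join("registry-a", registry_a_adapter)
travel.join("registry-b", registry_b_adapter)
airlines.join("registry-b", registry_b_adapter)  # same registry, second federation

before = travel.search("flight")
travel.leave("registry-a")
after = travel.search("flight")
print(before, after)
```

Each registry keeps its own data model and only exposes an adapter, so it keeps its autonomy; the federation just forwards queries to whoever is a member at that moment.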

So, overall, there is a requirement both to adhere to common standards and to develop mechanisms to retrieve web services information from repositories built upon multiple standards. IMO the latter is much better as it is not limited to a particular standard. (Lessons from history)
