Why Web Services?

Overview
Component-based programming has become more popular than ever. Hardly an application is built today that does not involve leveraging components in some form, usually from different vendors. As applications have grown more sophisticated, the need to leverage components distributed on remote machines has also grown.
An example of a component-based application is an end-to-end e-commerce solution. An e-commerce application residing on a Web farm needs to submit orders to a back-end Enterprise Resource Planning (ERP) application. In many cases, the ERP application resides on different hardware and might run on a different operating system.
The Microsoft Distributed Component Object Model (DCOM), a distributed object infrastructure that allows an application to invoke Component Object Model (COM) components installed on another server, has been ported to a number of non-Windows platforms. But DCOM has never gained wide acceptance on these platforms, so it is rarely used to facilitate communication between Windows and non-Windows computers. ERP software vendors often create components for the Windows platform that communicate with the back-end system via a proprietary protocol.
Some services leveraged by an e-commerce application might not reside within the datacenter at all. For example, if the e-commerce application accepts credit card payment for goods purchased by the customer, it must elicit the services of the merchant bank to process the customer's credit card information. But for all practical purposes, DCOM and related technologies such as CORBA and Java RMI are limited to applications and components installed within the corporate datacenter. Two primary reasons for this are that by default these technologies leverage proprietary protocols and these protocols are inherently connection oriented.
Clients communicating with the server over the Internet face numerous potential barriers to communicating with the server. Security-conscious network administrators around the world have implemented corporate routers and firewalls to disallow practically every type of communication over the Internet. It often takes an act of God to get a network administrator to open ports beyond the bare minimum.
If you're lucky enough to get a network administrator to open up the appropriate ports to support your service, chances are your clients will not be as fortunate. As a result, proprietary protocols such those used by DCOM, CORBA, and Java RMI are not practical for Internet scenarios.
The other problem, as I said, with these technologies is that they are inherently connection oriented and therefore cannot handle network interruptions gracefully. Because the Internet is not under your direct control, you cannot make any assumptions about the quality or reliability of the connection. If a network interruption occurs, the next call the client makes to the server might fail.
The connection-oriented nature of these technologies also makes it challenging to build the load-balanced infrastructures necessary to achieve high scalability. Once the connection between the client and the server is severed, you cannot simply route the next request to another server.
Developers have tried to overcome these limitations by leveraging a model called stateless programming, but they have had limited success because the technologies are fairly heavy and make it expensive to reestablish a connection with a remote object.
Because the processing of a customer's credit card is accomplished by a remote server on the Internet, DCOM is not ideal for facilitating communication between the e-commerce client and the credit card processing server. As in an ERP solution, a third-party component is often installed within the client's datacenter (in this case, by the credit card processing solution provider). This component serves as little more than a proxy that facilitates communication between the e-commerce software and the merchant bank via a proprietary protocol.
Do you see a pattern here? Because of the limitations of existing technologies in facilitating communication between computer systems, software vendors have often resorted to building their own infrastructure. This means resources that could have been used to add improved functionality to the ERP system or the credit card processing system have instead been devoted to writing proprietary network protocols.
In an effort to better support such Internet scenarios, Microsoft initially adopted the strategy of augmenting its existing technologies, including COM Internet Services (CIS), which allows you to establish a DCOM connection between the client and the remote component over port 80. For various reasons, CIS was not widely accepted.
It became clear that a new approach was needed. So Microsoft decided to address the problem from the bottom up. Let's look at some of the requirements the solution had to meet in order to succeed.
  • Interoperability The remote service must be able to be consumed by clients on other platforms.

  • Internet friendliness The solution should work well for supporting clients that access the remote service from the Internet.

  • Strongly typed interfaces There should be no ambiguity about the type of data sent to and received from a remote service. Furthermore, datatypes defined by the remote service should map reasonably well to datatypes defined by most procedural programming languages.

  • Ability to leverage existing Internet standards The implementation of the remote service should leverage existing Internet standards as much as possible and avoid reinventing solutions to problems that have already been solved. A solution built on widely adopted Internet standards can leverage existing toolsets and products created for the technology.

  • Support for any language The solution should not be tightly coupled to a particular programming language. Java RMI, for example, is tightly coupled to the Java language. It would be difficult to invoke functionality on a remote Java object from Visual Basic or Perl. A client should be able to implement a new Web service or use an existing Web service regardless of the programming language in which the client was written.

  • Support for any distributed component infrastructure The solution should not be tightly coupled to a particular component infrastructure. In fact, you shouldn't be required to purchase, install, or maintain a distributed object infrastructure just to build a new remote service or consume an existing service. The underlying protocols should facilitate a base level of communication between existing distributed object infrastructures such as DCOM and CORBA.
Given the title of this book, it should come as no surprise that the solution Microsoft created is known as Web services. A Web service exposes an interface to invoke a particular activity on behalf of the client. A client can access the Web service through the use of Internet standards.
Web Services Building Blocks
The following graphic shows the core building blocks needed to facilitate remote communication between two applications.
Let's discuss the purpose of each of these building blocks. Because many readers are familiar with DCOM, I will also mention the DCOM equivalent of each building block.
  • Discovery The client application that needs access to functionality exposed by a Web service needs a way to resolve the location of the remote service. This is accomplished through a process generally termed discovery. Discovery can be facilitated via a centralized directory as well as by more ad hoc methods. In DCOM, the Service Control Manager (SCM) provides discovery services.

  • Description Once the end point for a particular Web service has been resolved, the client needs sufficient information to properly interact with it. The description of a Web service encompasses structured metadata about the interface that is intended to be consumed by a client application as well as written documentation about the Web service including examples of use. A DCOM component exposes structured metadata about its interfaces via a type library (typelib). The metadata within a component's typelib is stored in a proprietary binary format and is accessed via a proprietary application programming interface (API).

  • Message format In order to exchange data, a client and a server have to agree on a common way to encode and format the messages. A standard way of encoding data ensures that data encoded by the client will be properly interpreted by the server. In DCOM, messages sent between a client and a server are formatted as defined by the DCOM Object RPC (ORPC) protocol.
Without a standard way of formatting the messages, developing a toolset to abstract the developer from the underlying protocols is next to impossible. Creating an abstraction layer between the developer and the underlying protocols allows the developer to focus more on the business problem at hand and less on the infrastructure required to implement the solution.
  • Encoding The data transmitted between the client and the server needs to be encoded into the body of the message. DCOM uses a binary encoding scheme to serialize the data contained by the parameters exchanged between the client and the server.

  • Transport Once the message has been formatted and the data has been serialized into the body of the message, the message must be transferred between the client and the server over some transport protocol. DCOM supports a number of proprietary protocols bound to a number of network protocols such as TCP, SPX, NetBEUI, and NetBIOS over IPX.
Web Services Design Decisions
Let's discuss some of the design decisions behind these building blocks for Web services.
Choosing Transport Protocols
The first step was to determine how the client and the server would communicate with each other. The client and the server can reside on the same LAN, but the client might potentially communicate with the server over the Internet. Therefore, the transport protocol must be equally suited to LAN environments and the Internet.
As I mentioned earlier, technologies such as DCOM, CORBA, and Java RMI are ill suited for supporting communication between the client and the server over the Internet. Protocols such as Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP) are proven Internet protocols. HTTP defines a request/response messaging pattern for submitting a request and getting an associated response. SMTP defines a routable messaging protocol for asynchronous communication. Let's examine why HTTP and SMTP are well suited for the Internet.
HTTP-based Web applications are inherently stateless. They do not rely on a continuous connection between the client and the server. This makes HTTP an ideal protocol for high-availability configurations such as firewalls. If the server that handled the client's original request becomes unavailable, subsequent requests can be automatically routed to another server without the client knowing or caring.
Almost all companies have an infrastructure in place that supports SMTP. SMTP is well suited for asynchronous communication. If service is disrupted, the e-mail infrastructure automatically handles retries. Unlike with HTTP, you can pass SMTP messages to a local mail server that will attempt to deliver the mail message on your behalf.
The other significant advantage of both HTTP and SMTP is their pervasiveness. Employees have come to rely on both e-mail and their Web browsers, and network administrators have a high comfort level supporting these services. Technologies such as network address translation (NAT) and proxy servers provide a way to access the Internet via HTTP from within otherwise isolated corporate LANs. Administrators will often expose an SMTP server that resides inside the firewall. Messages posted to this server will then be routed to their final destination via the Internet.
In the case of credit card processing software, an immediate response is needed from the merchant bank to determine whether the order should be submitted to the ERP system. HTTP, with its request/response message pattern, is well suited to this task.
Most ERP software packages are not capable of handling large volumes of orders that can potentially be driven from the e-commerce application. In addition, it is not imperative that the orders be submitted to the ERP system in real time. Therefore, SMTP can be leveraged to queue orders so that they can be processed serially by the ERP system.
If the ERP system supports distributed transactions, another option is to leverage Microsoft Message Queue Server (MSMQ). As long as the e-commerce application and the ERP system reside within the same LAN, connectivity via non-Internet protocols is less of an issue. The advantage MSMQ has over SMTP is that messages can be placed and removed from the queue within the scope of a transaction. If an attempt to process a message that was pulled off the queue fails, the message will automatically be placed back in the queue when the transaction aborts.
Choosing an Encoding Scheme
HTTP and SMTP provide a means of sending data between the client and the server. However, neither specifies how the data within the body of the message should be encoded. Microsoft needed a standard, platform-neutral way to encode data exchanged between the client and the server.
Because the goal was to leverage Internet-based protocols, Extensible Markup Language (XML) was the natural choice. XML offers many advantages, including cross-platform support, a common type system, and support for industry -standard character sets.
Binary encoding schemes such as those used by DCOM, CORBA, and Java RMI must address compatibility issues between different hardware platforms. For example, different hardware platforms have different internal binary representation of multi-byte numbers. Intel platforms order the bytes of a multi-byte number using the little endian convention; many RISC processors order the bytes of a multi-byte number using the big endian convention.
XML avoids binary encoding issues because it uses a text-based encoding scheme that leverages standard character sets. Also, some transport protocols, such as SMTP, can contain only text-based messages.
Binary methods of encoding, such as those used by DCOM and CORBA, are cumbersome and require a supporting infrastructure to abstract the developer from the details. XML is much lighter weight and easier to handle because it can be created and consumed using standard text-parsing techniques.
In addition, a variety of XML parsers are available to further simplify the creation and consumption of XML documents on practically every modern platform. XML is lightweight and has excellent tool support, so XML encoding allows incredible reach because practically any client on any platform can communicate with your Web service.
Choosing a Formatting Convention
It is often necessary to include additional metadata with the body of the message. For example, you might want to include information about the type of services that a Web service needs to provide in order to fulfill your request, such as enlisting in a transaction or routing information. XML provides no mechanism for differentiating the body of the message from its associated data.
Transport protocols such as HTTP provide an extensible mechanism for header data, but some data associated with the message might not be specific to the transport protocol. For example, the client might send a message that needs to be routed to multiple destinations, potentially over different transport protocols. If the routing information were placed into an knust ipad, it would have to be translated before being sent to the next intermediary over another transport protocol, such as SMTP. Because the routing information is specific to the message and not the transport protocol, it should be a part of the message.
Simple Object Access Protocol (SOAP) provides a protocol-agnostic means of associating h

Comments

Popular posts from this blog

There Is an Opposite to Insomnia and It Is Called Excessive Sleepiness

Durasoft Color Contact Lenses

How To Optimize Your SEO for Mobile-First