Archive for the ‘patterns’ Category

For the holiday season, get Mashup Patterns for 30% off!

Thanks to the folks over at I found out about this special deal. It works for other Pearson books too (there are others ;-) ), which covers Addison-Wesley Professional, Cisco Press, Exam Cram, IBM Press, Prentice Hall Professional, QUE and Sams.

Input code “TECHBARG30IT” in step 3 of the checkout process, “Payment Method”. Free Ground Shipping. Tax in most states.

Happy Holidays!

The Biggest Thing I left out of Mashup Patterns

About this time last year I sequestered myself in a rental house on Cape Cod while my wife and parents took my daughters on an endless loop of beach/mini-golf outings. My time was spent finishing up the primary text for Mashup Patterns. For those of you thinking, “But the book only came out a few months ago, how is that possible?” let me just say, “Yes, the publishing process is that complicated”. You can imagine how disruptive eBooks are – but that’s a story for a different time and place.

This year I was back on Cape Cod again, but able to relax a bit myself. It seemed like a good time to reflect a bit on what did (and didn’t) make the final cut. One of the biggest things I left out of Mashup Patterns was a discussion of Screen Scraping. Actually, that’s not entirely accurate. I talk about it a lot in Chapter 1 (“Acquiring Data from the Web”) then again in Chapter 4 (“Data Extraction”) and a little more still in Chapter 4 (“Harvest Patterns”). So how can I say I left it out?

What I talk about specifically is “scraping” data from Web pages, which I prefer to call “DOM Parsing” since what you’re really doing is traversing the page’s underlying Document Object Model and looking at things like CSS attributes, IDs, names, etc., rather than an item’s absolute screen position. “DOM Parsing” is the underlying technique used by products like Kapow, Dapper, Convertigo, and Mozenda.
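To make the distinction concrete, here is a minimal sketch of DOM-style extraction using only Python's standard library. It is not how any of the products above work internally; the `TextById` class, the `text_by_id` helper, and both sample pages are made up for illustration. The point is simply that the lookup keys on an element's id, so the value is still found after the page layout changes.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; ignored so nesting depth stays correct.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TextById(HTMLParser):
    """Collect the text inside the element whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # > 0 while we are inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def text_by_id(html, target_id):
    """Return the text of the element with the given id, or '' if absent."""
    parser = TextById(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Two hypothetical layouts of the same page: the price moves, its id does not.
OLD_LAYOUT = '<div><span id="price">19.99</span><p>Widget</p></div>'
NEW_LAYOUT = ('<table><tr><td>Widget</td>'
              '<td><b id="price">19.99</b></td></tr></table>')

print(text_by_id(OLD_LAYOUT, "price"))
print(text_by_id(NEW_LAYOUT, "price"))
```

Both calls find the same value even though the element moved from a `span` in a `div` to a `b` in a table cell. A real harvesting tool would also handle malformed markup, CSS classes, and repeated elements, which this sketch does not attempt.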

Why do I bother making this distinction? The reality is that Screen Scraping has a negative connotation in many circles. The old techniques for acquiring data based on X/Y screen position are not well regarded. And justly so; many of us built solutions on top of screen scraping products only to see them fail miserably when a single label was renamed. I wanted readers to know that Web Harvesting was a much more robust and fault-tolerant approach for the twenty-first century.

Have I had any impact? I’m not sure. I’m happy to see sites like Mozenda advertise that they perform Web Harvesting, but of course right above that they claim they do Screen Scraping as well. Argh! Plus, I continue to be on panels and calls and hear the two terms used interchangeably. When a technology is reasonably similar to something a developer has seen before, we have the tendency to use old labels, judge it by previous experiences, etc. This inclination can keep us from recognizing the truly innovative stuff, so it’s a trap we have to watch out for. But I digress; the point I actually want to make is that there is a time and place where Screen Scraping can add value.

I don’t know how many mainframe-based systems with no web front-end are out there, but it’s a stockpile that’s accumulated over decades. A large number, being perfectly suited for their purpose, will continue to linger along unchanged. Perhaps it’s just too expensive to re-write or re-platform them. How do we then leverage these resources in our new creations? For example, how do we take an ancient order-fulfillment system and link it with a snazzy new Salesforce application?

You see where I’m headed. This is a mashup, too. Only instead of the normal cadre of web applications, RSS feeds, SOAP APIs, etc., we want to include mainframe content. And if there are no other avenues in, via the database or another feed, then Screen Scraping is a perfectly viable option. In 2004, before the term Mashup was in wide use, David Linthicum’s book, Next Generation Application Integration: From Simple Information to… talked about this exact approach.

“Leveraging the user interface as a point of information integration is a process known as “screen scraping,” or accessing screen information through a programmatic mechanism. Middleware drives a user interface (e.g., 3270 user interface) in order to access information. Simply put, many application integration projects will have no other choice but to leverage user interfaces to access application data and processes. Sometimes access to underlying databases and application interfaces does not exist.”

David also summarized the all-too-common downsides:

“A user interface was never designed to serve up data, but it is now being used for precisely that purpose. It should go without saying that the data-gathering performance of user screens leaves a lot to be desired. In addition, this type of solution can’t scale, so it is unable to handle more than a few screen interfaces at any given time. Finally, if the application integration architect and developer do not set up these mechanisms carefully, they may prove unstable.”
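For readers who have never worked “at the glass”, here is a rough sketch of what position-based extraction looks like and why it can be unstable. Everything in it is hypothetical: the `FIELD_MAP` coordinates, the `scrape_fields` helper, and the sample screens are invented for illustration, and a real 3270 integration would go through a terminal emulation layer rather than a list of strings.

```python
# Hypothetical field map: name -> (row, column, length), all zero-based.
FIELD_MAP = {
    "order_id": (1, 10, 8),
    "status": (2, 10, 10),
}


def scrape_fields(screen_rows, field_map):
    """Cut each field out of the screen text purely by absolute position."""
    result = {}
    for name, (row, col, length) in field_map.items():
        result[name] = screen_rows[row][col:col + length].strip()
    return result


# A made-up terminal screen, one string per row.
SCREEN = [
    "ORDER INQUIRY",
    "ORDER NO: 00012345",
    "STATUS:   SHIPPED",
]

# The same screen after one label is renamed, which shifts the value right.
RENAMED = [
    "ORDER INQUIRY",
    "ORDER NUMBER: 00012345",
    "STATUS:   SHIPPED",
]

print(scrape_fields(SCREEN, FIELD_MAP))
print(scrape_fields(RENAMED, FIELD_MAP))
```

On the original screen the order number comes back intact. After the label change, the same coordinates slice through the label and return only part of the number, with no error raised. That silent failure is the kind of instability the quote warns about.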

Nevertheless, sometimes working “at the glass” is our only option. Dozens of companies offer solutions in this space, but only two that I’m aware of (Convertigo and Lansa) have actually connected the dots between interacting with a mainframe and building enterprise mashups. When David wrote his book, it was for an IT audience focused on integrating disparate applications. Today, we realize that besides this lofty goal, sometimes it’s just as useful to mine small nuggets of useful functionality from a system to build something unique and new.

I started this post out with a little reflection on where I was about a year ago. Personally, I think leaving the mainframe discussion out of Mashup Patterns was the right call. But professionally, none of us can afford to ignore the past. Legacy resources are everywhere, and they can easily be incorporated into today’s new mashups. Regardless of the sources underpinning your mashup, you need to be aware of their fragility, provide notification and controls for dealing with any unexpected downtime, and incorporate multiple, redundant sources for data when possible. Don’t let a bias against “scraping” keep you from using what may be some of the most valuable functionality in your firm.
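The advice about notification and redundant sources can be sketched in a few lines. This is only one possible shape, under stated assumptions: `fetch_with_fallback`, the source names, and the stub fetch functions are all hypothetical, and a production mashup would add timeouts, retries, and proper alerting instead of a plain callback.

```python
def fetch_with_fallback(sources, notify=print):
    """Try each (name, fetch) pair in order.

    Every failure is reported through notify; the first source that
    succeeds wins. Raises RuntimeError if none of them work.
    """
    for name, fetch in sources:
        try:
            return name, fetch()
        except Exception as exc:
            notify(f"source {name!r} failed: {exc}")
    raise RuntimeError("all sources failed")


# Stub sources standing in for a scraped legacy screen and a nightly extract.
def scrape_mainframe():
    raise ConnectionError("terminal session unavailable")


def read_nightly_extract():
    return {"order_id": "00012345", "status": "SHIPPED"}


alerts = []
used, data = fetch_with_fallback(
    [("mainframe", scrape_mainframe), ("extract", read_nightly_extract)],
    notify=alerts.append,
)
print(used, data, alerts)
```

The caller learns which source actually answered, so it can flag data that may be stale because it came from the backup.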

What are Enterprise Mashups?

What are enterprise mashups? JackBe is running a contest to come up with a definition everyone can agree on. I think the biggest problem is that people try and define the tools, the goals, and the constituent parts in the same sentence. It leads to cumbersome, wordy, and sometimes narrow definitions like this one:

“Enterprise mashups are integrated business applications that combine data from two or more data sources, including enterprise data sources such as web services, and possibly external web services. The nature of mashups is to provide a quick return on investment using high-productivity tools and techniques. These include AJAX, browser-based tools, and reusing existing web services and web components.” (source

Would a layperson (and potential mashup creator) understand that? It also perpetuates the common misconception that mashups must have more than one data source. Wrong! People assume that prerequisite because “mashups” seems to demand this plurality, but in fact there are advantages to using mashup tools with only a single web site (such as creating an RSS feed or API where none previously existed).
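As a small illustration of the single-source case, the sketch below turns a list of items already harvested from one site into an RSS 2.0 feed using Python's standard library. The `build_rss` function, the example.com URLs, and the sample items are assumptions made up for this example; how the items get harvested in the first place is out of scope here.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Return an RSS 2.0 document (as a string) for the given items.

    Each item is a dict with 'title' and 'link' keys.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical items pulled from a single site that offers no feed of its own.
ITEMS = [
    {"title": "Price list updated", "link": "http://example.com/prices"},
    {"title": "New branch opened", "link": "http://example.com/branches"},
]

feed = build_rss(
    "Example site updates",
    "http://example.com/",
    "Feed generated from a site with no feed of its own",
    ITEMS,
)
print(feed)
```

One site goes in and one feed comes out, yet any feed reader or downstream mashup can now consume content that was previously only available by visiting the page.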

How about this one:

“Mashups are a brute force joining of disparate Web Data, oblivious to the underlying Data Model(s), and often based on RSS 2.0 (a Tree Structure that contains untyped or meaning-challenged links.)” (source: )

A “Brute force” approach sounds neither flexible nor easy, which is ideally the exact opposite of what enterprise mashups aim to be. Let’s look at one more:

“A mashup is a dynamic web application that brings together data stored in many different applications for better decision making” (source: Luis Derechin, JackBe CEO on FOX Business)

We’re getting closer because this is the first definition that answers the “why” about mashups. Why should I care about these things? Luis undoubtedly knows mashups can accomplish a host of things including streamlining a cumbersome process, pushing content to alternative devices, etc. My guess is he homed in on “better decision making” because he was thinking of what would be of interest to the typical FOX Business viewer. But now we’re on the right track.

First, we should recognize the distinction between a conversational definition, which can be used to spark further discussion, and a definition for reference sources like Wikipedia. I’m going to focus on the former case, which is still tricky territory to navigate. It’s like trying to explain what an Operating System does outside of technical circles. Is it important to explain issues like threading, memory management or job scheduling? No; I believe users want to understand what an O/S lets you accomplish and not how it does it. We should hold a definition of mashups to this same principle.

To paraphrase John Crupi’s (JackBe’s CTO) remarks at CeBIT, the definition of enterprise mashups is of little use if it doesn’t communicate their value to the business. To that end, here’s how I explain enterprise mashups when asked:

Enterprise mashups unleash the information locked in a company’s systems and the creativity trapped within its employees to allow anyone to quickly meet specific business challenges.

As I said – this is just a jumping off point. I normally follow up this statement with a series of questions: “Have you ever used an application that would be perfect if it had just one small change?”, “Have you ever had to wait forever to get the tools you needed to do your job?”, and “Do you constantly waste time cutting and pasting from different applications to get your work done?”

I don’t want a short definition (no matter how “technically correct”) that is obscure and intimidates users. The key is to pique the listener’s curiosity and draw them in. Once they share their problems I can help them understand how enterprise mashups can help.

Mining the Deep Web with Mashups

I have an introductory article on how to mine the Deep Web (and Deep Intranet!) with mashups over at InformIT. The article is a good introduction for the layperson who may be unaware of how mashups can be used in this manner.

I have to point out one small mistake: the article mentions that the Presto platform (from JackBe) is able to implement the API Enabler pattern through their partnership with Dapper. In fact, their in-house developed EMML (Enterprise Mashup Markup Language) also makes this possible (w/o needing any Daps).

Twinsoft’s Convertigo platform (not mentioned in the article) is also capable of implementing API Enabler. I’m sure there are other tools I am missing that can create a SOAP or REST API against a presentation layer (either by screen-scraping or the more elegant technique of parsing a page’s DOM). New tools seem to be popping up almost daily! The take-away is that mashups can do more than combine disparate sites together. They can also extract data “at the glass” when a public interface isn’t available (or doesn’t expose the specific information you’re after).

“An Introduction to Enterprise Mashups” webex + Free Chapter 1 Download

This week I had the pleasure of visiting JackBe headquarters in Chevy Chase, MD and conducting a joint webex with CTO John Crupi. For those who might not know, John is a well-established figure in the Patterns community, having co-authored Sun’s Core J2EE Patterns book. John had the foresight to register more than a year ago and graciously donated the domain to me last year.

JackBe has two case studies in Mashup Patterns, but the purpose of this webex wasn’t to showcase these or their Presto Platform – it was an educational event designed to explain the evolution of Enterprise Mashups, present some of the patterns, and demo a few implementations JackBe had created.

What was more impressive than the record-breaking attendance we received was the quantity and quality of the questions raised during the talk. In fact, so many good issues were brought up that it’s taken us a few days to put all the answers together! The questions demonstrate that people are really starting to understand the promise of Mashups within the enterprise and are thinking about the right things.

As an additional bonus, JackBe secured permission to distribute the first chapter of Mashup Patterns for free. If you want to come up to speed on this new paradigm quickly (or know someone who does) this is a unique collection of great resources.

The video, question/answer archive, and Chapter 1 are available at:
