Archive for the ‘patterns’ Category

For the holiday season, get Mashup Patterns for 30% off!

Thanks to the folks over at I found out about this special deal. It works for other Pearson books too (there are others ;-)); Pearson’s imprints include Addison-Wesley Professional, Cisco Press, Exam Cram, IBM Press, Prentice Hall Professional, QUE, and Sams.

Input code “TECHBARG30IT” in step 3 of the checkout process, “Payment Method”. Free Ground Shipping. Tax in most states.

Happy Holidays!

The Biggest Thing I left out of Mashup Patterns

About this time last year I sequestered myself in a rental house on Cape Cod while my wife and parents took my daughters on an endless loop of beach/mini-golf outings. My time was spent finishing up the primary text for Mashup Patterns. For those of you thinking, “But the book only came out a few months ago, how is that possible?” let me just say, “Yes, the publishing process is that complicated”. You can imagine how disruptive eBooks are – but that’s a story for a different time and place.

This year I was back on Cape Cod again, but able to relax a bit myself. It seemed like a good time to reflect a bit on what did (and didn’t) make the final cut. One of the biggest things I left out of Mashup Patterns was a discussion of Screen Scraping. Actually, that’s not entirely accurate. I talk about it a lot in Chapter 1 (“Acquiring Data from the Web”) then again in Chapter 4 (“Data Extraction”) and a little more still in Chapter 4 (“Harvest Patterns”). So how can I say I left it out?

What I talk about specifically is “scraping” data from Web pages, which I prefer to call “DOM Parsing” since what you’re really doing is traversing the page’s underlying Document Object Model and looking at things like CSS attributes, ids, names, etc. rather than an item’s absolute screen position. “DOM Parsing” is the underlying technique used by products like Kapow, Dapper, Convertigo, and Mozenda.
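
To make the idea concrete, here is a minimal sketch of locating a value by its id rather than by where it sits on the page. It uses only Python's standard html.parser module; the two page snippets, the "order-total" id, and the helper names are all made up for illustration, and none of this is how the commercial products mentioned above are implemented.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; ignored when tracking nesting.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class IdTextExtractor(HTMLParser):
    """Collect the text inside the first element whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0        # > 0 while we are inside the target element
        self.done = False     # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS or self.done:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def text_by_id(html, target_id):
    """Return the text of the element with the given id, or '' if absent."""
    parser = IdTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Hypothetical page, before and after a redesign that moves things around.
PAGE_V1 = '<html><body><h1>Orders</h1><span id="order-total">$41.50</span></body></html>'
PAGE_V2 = ('<html><body><div class="banner">Sale!</div><h1>Your Orders</h1>'
           '<p>Total: <span id="order-total">$41.50</span></p></body></html>')

print(text_by_id(PAGE_V1, "order-total"))  # $41.50
print(text_by_id(PAGE_V2, "order-total"))  # $41.50
```

The second page adds a banner, renames the heading, and wraps the total in a new paragraph, yet the lookup still succeeds because it depends on the id and not on the layout.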

Why do I bother making this distinction? The reality is that Screen Scraping has a negative connotation in many circles. The old techniques for acquiring data based on X/Y screen position are not well regarded. And justly so; many of us built solutions on top of screen scraping products only to see them fail miserably when a single label was renamed. I wanted readers to know that Web Harvesting was a much more robust and fault-tolerant approach for the twenty-first century.
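
For contrast, here is a small sketch of the positional approach and why it breaks. The screens, the field coordinates, and the field_at helper are invented for the example; real terminal-scraping products are more sophisticated, but the failure mode is the same.

```python
# A hypothetical green-screen, captured as fixed-width rows of text.
SCREEN_V1 = [
    "ORDER INQUIRY",
    "CUST NO: 004217   STATUS: OPEN",
]

# The same screen after someone renames a single label.
SCREEN_V2 = [
    "ORDER INQUIRY",
    "CUSTOMER NO: 004217   STATUS: OPEN",
]


def field_at(screen, row, col, length):
    """Read a field purely by its position on the screen."""
    return screen[row][col:col + length]


# The customer number was mapped to row 1, column 9, length 6.
print(field_at(SCREEN_V1, 1, 9, 6))  # 004217
print(field_at(SCREEN_V2, 1, 9, 6))  # NO: 00
```

Nothing raises an error in the second case. The scraper quietly returns garbage, which is arguably worse than failing outright.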

Have I had any impact? I’m not sure. I’m happy to see sites like Mozenda advertise that they perform Web Harvesting, but of course right above that they claim they do Screen Scraping as well. Argh! Plus, I continue to be on panels and calls and hear the two terms used interchangeably. When a technology is reasonably similar to something a developer has seen before, we have the tendency to use old labels, judge it by previous experiences, etc. This inclination can keep us from recognizing the truly innovative stuff, so it’s a trap we have to watch out for. But I digress; the point I actually want to make is that there is a time and place where Screen Scraping can add value.

I don’t know how many mainframe-based systems with no web front-end are out there, but it’s a stockpile that’s accumulated over decades. A large number, being perfectly suited for their purpose, will continue to linger on unchanged. Perhaps it’s just too expensive to re-write or re-platform them. How do we then leverage these resources in our new creations? For example, how do we take an ancient order-fulfillment system and link it with a snazzy new Salesforce application?

You see where I’m headed. This is a mashup, too. Only instead of the normal cadre of web applications, RSS feeds, SOAP APIs, etc., we want to include mainframe content. And if there are no other avenues in via the database or other feed, then Screen Scraping is a perfectly viable option. In 2004, before the term Mashup was in wide use, David Linthicum’s book, Next Generation Application Integration: From Simple Information to… talked about this exact approach.

“Leveraging the user interface as a point of information integration is a process known as “screen scraping,” or accessing screen information through a programmatic mechanism. Middleware drives a user interface (e.g., 3270 user interface) in order to access information. Simply put, many application integration projects will have no other choice but to leverage user interfaces to access application data and processes. Sometimes access to underlying databases and application interfaces does not exist.”

David also summarized the all-too-common downsides:

“A user interface was never designed to serve up data, but it is now being used for precisely that purpose. It should go without saying that the data-gathering performance of user screens leaves a lot to be desired. In addition, this type of solution can’t scale, so it is unable to handle more than a few screen interfaces at any given time. Finally, if the application integration architect and developer do not set up these mechanisms carefully, they may prove unstable.”

Nevertheless, sometimes working “at the glass” is our only option. Dozens of companies offer solutions in this space, but only two that I’m aware of (Convertigo and Lansa) have actually connected the dots between interacting with a mainframe and building enterprise mashups. When David wrote his book, it was for an IT audience focused on integrating disparate applications. Today, we realize that besides this lofty goal, sometimes it’s just as useful to mine small nuggets of useful functionality from a system to build something unique and new.

I started this post out with a little reflection on where I was about a year ago. Personally, I think leaving the mainframe discussion out of Mashup Patterns was the right call. But professionally, none of us can afford to ignore the past. Legacy resources are everywhere, and they can easily be incorporated into today’s new mashups. Regardless of the sources underpinning your mashup, you need to be aware of their fragility, provide notification and controls for dealing with any unexpected downtime, and incorporate multiple, redundant sources for data when possible. Don’t let a bias against “scraping” keep you from using what may be some of the most valuable functionality in your firm.
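
The advice about fragility, notification, and redundant sources can be sketched in a few lines. Everything named here (fetch_with_fallback, the two stand-in sources, the alert list) is hypothetical; it is one possible shape for the idea, not a prescription.

```python
def fetch_with_fallback(sources, notify):
    """Try each (name, fetch) pair in order and return the first success.

    Every failure is reported through notify() so that someone finds out
    a source is down, even when a backup quietly covers for it.
    """
    for name, fetch in sources:
        try:
            return name, fetch()
        except Exception as exc:
            notify(f"source {name!r} failed: {exc}")
    raise RuntimeError("all sources failed")


# Stand-ins for a scraped mainframe screen and a redundant nightly extract.
def scrape_mainframe():
    raise ConnectionError("host unreachable")


def read_nightly_extract():
    return {"order": "004217", "status": "OPEN"}


alerts = []
used, data = fetch_with_fallback(
    [("mainframe", scrape_mainframe), ("nightly extract", read_nightly_extract)],
    alerts.append,
)
print(used)    # nightly extract
print(alerts)  # one message naming the failed mainframe source
```

In practice notify might send an email or page an operator; a plain list keeps the sketch self-contained.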

What are Enterprise Mashups?

What are enterprise mashups? JackBe is running a contest to come up with a definition everyone can agree on. I think the biggest problem is that people try to define the tools, the goals, and the constituent parts in the same sentence. It leads to cumbersome, wordy, and sometimes narrow definitions like this one:

“Enterprise mashups are integrated business applications that combine data from two or more data sources, including enterprise data sources such as web services, and possibly external web services. The nature of mashups is to provide a quick return on investment using high-productivity tools and techniques. These include AJAX, browser-based tools, and reusing existing web services and web components.” (source

Would a layperson (and potential mashup creator) understand that? It also perpetuates the common misconception that mashups must have more than one data source. Wrong! People assume that prerequisite because “mashups” seems to demand this plurality, but in fact there are advantages to using mashup tools with only a single web site (such as creating an RSS feed or API where none previously existed).

How about this one:

“Mashups are a brute force joining of disparate Web Data, oblivious to the underlying Data Model(s), and often based on RSS 2.0 (a Tree Structure that contains untyped or meaning-challenged links.)” (source: )

A “Brute force” approach sounds neither flexible nor easy, which is ideally the exact opposite of what enterprise mashups aim to be. Let’s look at one more:

“A mashup is a dynamic web application that brings together data stored in many different applications for better decision making” (source: Luis Derechin, JackBe CEO on FOX Business)

We’re getting closer because this is the first definition that answers the “why” about mashups. Why should I care about these things? Luis undoubtedly knows mashups can accomplish a host of things including streamlining a cumbersome process, pushing content to alternative devices, etc. My guess is he homed in on “better decision making” because he was thinking of what would be of interest to the typical FOX Business viewer. But now we’re on the right track.

First, we should recognize the distinction between a conversational definition, which can be used to spark further discussion, and a definition for reference sources like Wikipedia. I’m going to focus on the former case, which is still tricky territory to navigate. It’s like trying to explain what an Operating System does outside of technical circles. Is it important to explain issues like threading, memory management or job scheduling? No; I believe users want to understand what an O/S lets you accomplish and not how it does it. We should hold a definition of mashups to this same principle.

To paraphrase John Crupi’s (JackBe’s CTO) remarks at CeBIT, the definition of enterprise mashups is of little use if it doesn’t communicate their value to the business. To that end, here’s how I explain enterprise mashups when asked:

Enterprise mashups unleash the information locked in a company’s systems and the creativity trapped within its employees to allow anyone to quickly meet specific business challenges.

As I said – this is just a jumping off point. I normally follow up this statement with a series of questions: “Have you ever used an application that would be perfect if it had just one small change?”, “Have you ever had to wait forever to get the tools you needed to do your job?”, and “Do you constantly waste time cutting and pasting from different applications to get your work done?”

I don’t want a short definition (no matter how “technically correct”) that is obtuse and intimidates users. The key is to pique the listener’s curiosity and draw them in. Once they share their problems I can help them understand how enterprise mashups can help.

Mining the Deep Web with Mashups

I have an introductory article on how to mine the Deep Web (and Deep Intranet!) with mashups over at InformIT. The article is a good introduction for the layperson who may be unaware of how mashups can be used in this manner.

I have to point out one small mistake; the article mentions that the Presto platform (from JackBe) is able to implement the API Enabler pattern through their partnership with Dapper. In fact, their in-house developed EMML (Enterprise Mashup Markup Language) also makes this possible (w/o needing any Daps).

Twinsoft’s Convertigo platform (not mentioned in the article) is also capable of implementing API Enabler. I’m sure there are other tools I am missing that can create a SOAP or REST API against a presentation layer (either by screen-scraping or the more elegant technique of parsing a page’s DOM). New tools seem to be popping up almost daily! The take-away is that mashups can do more than combine disparate sites together. They can also extract data “at the glass” when a public interface isn’t available (or doesn’t expose the specific information you’re after).
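
As a rough illustration of exposing presentation-layer data through an interface that didn't exist before, the sketch below takes items that have already been pulled off a page and republishes them as a small RSS 2.0 feed using Python's standard xml.etree.ElementTree. The items_to_rss function, the sample items, and the example.com URLs are assumptions for the example; this is not how Presto, Dapper, or Convertigo do it.

```python
import xml.etree.ElementTree as ET


def items_to_rss(title, link, description, items):
    """Serialize already-extracted items as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")


# Pretend these rows were harvested from a page with no feed of its own.
harvested = [
    {"title": "Order 004217 shipped", "link": "http://example.com/orders/004217"},
    {"title": "Order 004218 delayed", "link": "http://example.com/orders/004218"},
]

feed = items_to_rss(
    "Order updates",
    "http://example.com/orders",
    "Feed generated from a page that offers no feed of its own",
    harvested,
)
print(feed)
```

Serving that string over HTTP is all it takes for a feed reader, or another mashup, to consume a source that previously had no interface beyond its screen.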

“An Introduction to Enterprise Mashups” webex + Free Chapter 1 Download

This week I had the pleasure of visiting JackBe headquarters in Chevy Chase, MD and conducting a joint webex with CTO John Crupi. For those who might not know, John is a well-established figure in the Patterns community having co-authored Sun’s Core J2EE Patterns book. John had the foresight to register more than a year ago and graciously donated the domain to me last year.

JackBe has two case studies in Mashup Patterns, but the purpose of this webex wasn’t to showcase these or their Presto Platform – it was an educational event designed to explain the evolution of Enterprise Mashups, present some of the patterns, and demo a few implementations JackBe had created.

What was more impressive than the record-breaking attendance we received was the quantity and quality of the questions raised during the talk. In fact, so many good issues were brought up that it’s taken us a few days to put all the answers together! The questions demonstrate that people are really starting to understand the promise of Mashups within the enterprise and are thinking about the right things.

As an additional bonus, JackBe secured permission to distribute the first chapter of Mashup Patterns for free. If you want to come up to speed on this new paradigm quickly (or know someone who does) this is a unique collection of great resources.

The video, question/answer archive, and Chapter 1 are available at:
