Archive for the ‘patterns’ Category

For the holiday season, get Mashup Patterns for 30% off!

Thanks to the folks over at Techbargains.com, I found out about this special deal. It works for other Pearson books too (there are others ;-) ), covering the Addison-Wesley Professional, Cisco Press, Exam Cram, IBM Press, Prentice Hall Professional, QUE, and Sams imprints.

Enter code “TECHBARG30IT” in step 3 of the checkout process, “Payment Method”. Ground shipping is free. Tax applies in most states.

Happy Holidays!

The Biggest Thing I left out of Mashup Patterns

About this time last year I sequestered myself in a rental house on Cape Cod while my wife and parents took my daughters on an endless loop of beach/mini-golf outings. My time was spent finishing up the primary text for Mashup Patterns. For those of you thinking, “But the book only came out a few months ago, how is that possible?” let me just say, “Yes, the publishing process is that complicated”. You can imagine how disruptive eBooks are – but that’s a story for a different time and place.

This year I was back on Cape Cod again, but able to relax a bit myself. It seemed like a good time to reflect a bit on what did (and didn’t) make the final cut. One of the biggest things I left out of Mashup Patterns was a discussion of Screen Scraping. Actually, that’s not entirely accurate. I talk about it a lot in Chapter 1 (“Acquiring Data from the Web”), then again in Chapter 3 (“Data Extraction”), and a little more still in Chapter 4 (“Harvest Patterns”). So how can I say I left it out?

What I talk about specifically is “scraping” data from Web pages, which I prefer to call “DOM Parsing” since what you’re really doing is traversing the page’s underlying Document Object Model and looking at things like CSS attributes, ids, names, etc., rather than an item’s absolute screen position. “DOM Parsing” is the underlying technique used by products like Kapow, Dapper, Convertigo, and Mozenda.
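To make the distinction concrete, here is a minimal sketch of looking a value up by its id instead of by where it sits on the page. It uses only Python's standard html.parser module. The sample pages, the id "total", and the helper names are hypothetical, and this is not how any of the products named above work internally.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class IdTextExtractor(HTMLParser):
    """Collect the text inside the first element whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.chunks = []
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def text_by_id(html, target_id):
    """Return the text of the element with the given id, or '' if absent."""
    parser = IdTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Two hypothetical versions of the same page: the second adds a banner and
# moves the value into a table, so its position on screen changes.
PAGE_V1 = '<html><body><h1>Orders</h1><span id="total">$42.00</span></body></html>'
PAGE_V2 = (
    '<html><body><div class="banner">Sale!</div><h1>Orders</h1>'
    '<table><tr><td><span id="total">$42.00</span></td></tr></table>'
    "</body></html>"
)

print(text_by_id(PAGE_V1, "total"))
print(text_by_id(PAGE_V2, "total"))
```

Both calls find the same value even though the layout changed between the two versions, which is the robustness argument in a nutshell. The lookup would still break if the id itself were renamed.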

Why do I bother making this distinction? The reality is that Screen Scraping has a negative connotation in many circles. The old techniques for acquiring data based on X/Y screen position are not well regarded. And justly so; many of us built solutions on top of screen scraping products only to see them fail miserably when a single label was renamed. I wanted readers to know that Web Harvesting was a much more robust and fault-tolerant approach for the twenty-first century.

Have I had any impact? I’m not sure. I’m happy to see sites like Mozenda advertise that they perform Web Harvesting, but of course right above that they claim they do Screen Scraping as well. Argh! Plus, I continue to be on panels and calls and hear the two terms used interchangeably. When a technology is reasonably similar to something a developer has seen before, we have the tendency to use old labels, judge it by previous experiences, etc. This inclination can keep us from recognizing the truly innovative stuff, so it’s a trap we have to watch out for. But I digress; the point I actually want to make is that there is a time and place where Screen Scraping can add value.

I don’t know how many mainframe-based systems with no web front-end are out there, but it’s a stockpile that’s accumulated over decades. A large number, being perfectly suited for their purpose, will continue to linger along unchanged. Perhaps it’s just too expensive to re-write or re-platform them. How do we then leverage these resources in our new creations? For example, how do we take an ancient order-fulfillment system and link it with a snazzy new Salesforce application?

You see where I’m headed. This is a mashup, too. Only instead of the normal cadre of web applications, RSS feeds, SOAP APIs, etc., we want to include mainframe content. And if there are no other avenues in via the database or other feed, then Screen Scraping is a perfectly viable option. In 2004, before the term Mashup was in wide use, David Linthicum’s book, Next Generation Application Integration: From Simple Information to… talked about this exact approach.

“Leveraging the user interface as a point of information integration is a process known as “screen scraping,” or accessing screen information through a programmatic mechanism. Middleware drives a user interface (e.g., 3270 user interface) in order to access information. Simply put, many application integration projects will have no other choice but to leverage user interfaces to access application data and processes. Sometimes access to underlying databases and application interfaces does not exist.”

David also summarized the all-too-common downsides:

“A user interface was never designed to serve up data, but it is now being used for precisely that purpose. It should go without saying that the data-gathering performance of user screens leaves a lot to be desired. In addition, this type of solution can’t scale, so it is unable to handle more than a few screen interfaces at any given time. Finally, if the application integration architect and developer do not set up these mechanisms carefully, they may prove unstable.”

Nevertheless, sometimes working “at the glass” is our only option. Dozens of companies offer solutions in this space, but only two that I’m aware of (Convertigo and Lansa) have actually connected the dots between interacting with a mainframe and building enterprise mashups. When David wrote his book, it was for an IT audience focused on integrating disparate applications. Today, we realize that besides this lofty goal, sometimes it’s just as useful to mine small nuggets of useful functionality from a system to build something unique and new.
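As a rough illustration of working “at the glass”, the sketch below reads fields from a captured terminal screen by row and column, and checks that each label is where the field map expects it before trusting the value. The screen contents, field map, and all names are hypothetical. A real integration would capture the screen through a terminal emulation product, which is not shown here.

```python
from typing import NamedTuple


class ScreenLayoutError(Exception):
    """Raised when a label is not where the field map expects it."""


class FieldSpec(NamedTuple):
    row: int        # zero-based screen row
    label: str      # literal text expected just before the value
    label_col: int  # zero-based column where the label starts
    value_col: int  # zero-based column where the value starts
    width: int      # maximum number of characters to read


# Hypothetical field map for the hypothetical screen below.
FIELDS = {
    "order_no": FieldSpec(row=2, label="ORDER NO:", label_col=0, value_col=10, width=7),
    "status": FieldSpec(row=2, label="STATUS:", label_col=21, value_col=29, width=10),
    "customer": FieldSpec(row=3, label="CUSTOMER:", label_col=0, value_col=10, width=30),
}

# Captured screen as a list of text rows; ljust pins STATUS to column 21.
SCREEN = [
    "ORDER INQUIRY",
    "",
    "ORDER NO: 0012345".ljust(21) + "STATUS: SHIPPED",
    "CUSTOMER: ACME CORP",
]


def read_field(screen, spec):
    """Read one field by position, failing loudly if the layout has moved."""
    if spec.row >= len(screen):
        raise ScreenLayoutError(f"row {spec.row} is missing")
    line = screen[spec.row]
    found = line[spec.label_col:spec.label_col + len(spec.label)]
    if found != spec.label:
        raise ScreenLayoutError(
            f"expected {spec.label!r} at row {spec.row}, col {spec.label_col}; "
            f"found {found!r}"
        )
    return line[spec.value_col:spec.value_col + spec.width].strip()


def scrape(screen):
    """Return every mapped field from the screen as a dictionary."""
    return {name: read_field(screen, spec) for name, spec in FIELDS.items()}


print(scrape(SCREEN))
```

The label check does not make positional scraping robust. It only turns a silent wrong answer into an explicit error when the screen layout shifts, which is one way to set up these mechanisms “carefully”.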

I started this post out with a little reflection on where I was about a year ago. Personally, I think leaving the mainframe discussion out of Mashup Patterns was the right call. But professionally, none of us can afford to ignore the past. Legacy resources are everywhere, and they can easily be incorporated into today’s new mashups. Regardless of the sources underpinning your mashup, you need to be aware of their fragility, provide notification and controls for dealing with any unexpected downtime, and incorporate multiple, redundant sources for data when possible. Don’t let a bias against “scraping” keep you from using what may be some of the most valuable functionality in your firm.
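The advice about fragility, notification, and redundant sources can be sketched in a few lines. This is one possible shape, not a prescribed design: the source names, the sample data, and the notify callback are all hypothetical.

```python
import logging

logger = logging.getLogger("mashup")


class AllSourcesFailed(Exception):
    """Raised when every configured source has failed."""


def fetch_with_fallback(sources, notify=logger.warning):
    """Try each (name, fetch) pair in order.

    Returns (name, data) from the first source that succeeds. Each failure is
    reported through notify, so someone hears about the downtime even when a
    backup source quietly covers for it.
    """
    errors = []
    for name, fetch in sources:
        try:
            return name, fetch()
        except Exception as exc:  # any failure means: try the next source
            errors.append((name, exc))
            notify(f"source {name!r} failed: {exc}")
    raise AllSourcesFailed(errors)


# Hypothetical sources: a live screen scrape that is down, and a nightly export.
def scrape_mainframe():
    raise ConnectionError("terminal session refused")


def read_nightly_export():
    return {"order_no": "0012345", "status": "SHIPPED"}


alerts = []
result = fetch_with_fallback(
    [("mainframe_scrape", scrape_mainframe), ("nightly_export", read_nightly_export)],
    notify=alerts.append,
)
print(result)
print(alerts)
```

Ordering the list from freshest to stalest source keeps the best data when everything is healthy. Returning the source name alongside the data lets the mashup tell its users when they are looking at the backup.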

What are Enterprise Mashups?

What are enterprise mashups? JackBe is running a contest to come up with a definition everyone can agree on. I think the biggest problem is that people try to define the tools, the goals, and the constituent parts in the same sentence. It leads to cumbersome, wordy, and sometimes narrow definitions like this one:

“Enterprise mashups are integrated business applications that combine data from two or more data sources, including enterprise data sources such as web services, and possibly external web services. The nature of mashups is to provide a quick return on investment using high-productivity tools and techniques. These include AJAX, browser-based tools, and reusing existing web services and web components.” (source http://applibase.com/DataCaster_FAQ.html)

Would a layperson (and potential mashup creator) understand that? It also perpetuates the common misconception that mashups must have more than one data source. Wrong! People assume that prerequisite because “mashups” seems to demand this plurality, but in fact there are advantages to using mashup tools with only a single web site (such as creating an RSS feed or API where none previously existed).
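As a small illustration of the single-source case, the sketch below turns items already harvested from one site into an RSS 2.0 feed using Python's standard xml.etree.ElementTree module. The site URL and the items are hypothetical, and the harvesting step itself is assumed to have happened elsewhere.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, items):
    """Build a minimal RSS 2.0 document from a list of {'title', 'link'} dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link, and description on the channel.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = f"Feed generated from {channel_link}"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical items harvested from a single site that offers no feed of its own.
HARVESTED = [
    {"title": "Price list updated", "link": "http://example.com/news/1"},
    {"title": "New warehouse opened", "link": "http://example.com/news/2"},
]

feed_xml = build_rss("Example Corp News", "http://example.com/news", HARVESTED)
print(feed_xml)
```

One source goes in and one feed comes out, yet the result is still useful because any feed reader or downstream mashup can now consume a site that previously had no machine-readable interface.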

How about this one:

“Mashups are a brute force joining of disparate Web Data, oblivious to the underlying Data Model(s), and often based on RSS 2.0 (a Tree Structure that contains untyped or meaning-challenged links.)” (source: http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid) )

A “Brute force” approach sounds neither flexible nor easy, which is ideally the exact opposite of what enterprise mashups aim to be. Let’s look at one more:

“A mashup is a dynamic web application that brings together data stored in many different applications for better decision making” (source: Luis Derechin, JackBe CEO on FOX Business)

We’re getting closer because this is the first definition that answers the “why” about mashups. Why should I care about these things? Luis undoubtedly knows mashups can accomplish a host of things including streamlining a cumbersome process, pushing content to alternative devices, etc. My guess is he homed in on “better decision making” because he was thinking of what would be of interest to the typical FOX Business viewer. But now we’re on the right track.

First, we should recognize the distinction between a conversational definition, which can be used to spark further discussion, and a definition for reference sources like Wikipedia. I’m going to focus on the former case, which is still tricky territory to navigate. It’s like trying to explain what an Operating System does outside of technical circles. Is it important to explain issues like threading, memory management or job scheduling? No; I believe users want to understand what an O/S lets you accomplish and not how it does it. We should hold a definition of mashups to this same principle.

To paraphrase John Crupi’s (JackBe’s CTO) remarks at CeBIT, the definition of enterprise mashups is of little use if it doesn’t communicate their value to the business. To that end, here’s how I explain enterprise mashups when asked:

Enterprise mashups unleash the information locked in a company’s systems and the creativity trapped within its employees to allow anyone to quickly meet specific business challenges.

As I said, this is just a jumping-off point. I normally follow up this statement with a series of questions: “Have you ever used an application that would be perfect if it had just one small change?”, “Have you ever had to wait forever to get the tools you needed to do your job?”, and “Do you constantly waste time cutting and pasting from different applications to get your work done?”

I don’t want a short definition (no matter how “technically correct”) that is obtuse and intimidates users. The key is to pique the listener’s curiosity and draw them in. Once they share their problems I can help them understand how enterprise mashups can help.

Mining the Deep Web with Mashups

I have an introductory article on how to mine the Deep Web (and Deep Intranet!) with mashups over at InformIT. The article is a good introduction for the layperson who may be unaware of how mashups can be used in this manner.

I have to point out one small mistake: the article mentions that the Presto platform (from JackBe) is able to implement the API Enabler pattern through their partnership with Dapper. In fact, their in-house developed EMML (Enterprise Mashup Markup Language) also makes this possible (without needing any Daps).

Twinsoft’s Convertigo platform (not mentioned in the article) is also capable of implementing API Enabler. I’m sure there are other tools I am missing that can create a SOAP or REST API against a presentation layer (either by screen-scraping or the more elegant technique of parsing a page’s DOM). New tools seem to be popping up almost daily! The take-away is that mashups can do more than combine disparate sites together. They can also extract data “at the glass” when a public interface isn’t available (or doesn’t expose the specific information you’re after).
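For readers who have not seen the idea before, here is a minimal sketch of what putting a REST-style interface in front of a presentation layer could look like, written as a plain WSGI application in Python. It is not how Presto, EMML, Dapper, or Convertigo implement the pattern. The harvest_order function is a hypothetical stand-in for whatever DOM-parsing or screen-scraping step actually reads the data, and the URL scheme is an assumption.

```python
import json


def harvest_order(order_no):
    """Hypothetical stand-in for the harvesting step (DOM parsing or scraping).

    A real version would drive the underlying page or screen. This one reads
    canned data so the sketch stays self-contained.
    """
    canned = {"0012345": {"status": "SHIPPED", "customer": "ACME CORP"}}
    return canned.get(order_no)


def _json_response(start_response, status, payload):
    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def app(environ, start_response):
    """WSGI app: GET /orders/<order_no> returns the harvested record as JSON."""
    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    record = None
    if len(parts) == 2 and parts[0] == "orders":
        record = harvest_order(parts[1])
    if record is None:
        return _json_response(start_response, "404 Not Found", {"error": "not found"})
    return _json_response(start_response, "200 OK", record)


# To try it locally with the standard library's reference server:
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, app).serve_forever()
```

Callers see an ordinary JSON endpoint and never need to know that the data behind it was read off a page or a terminal screen.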

“An Introduction to Enterprise Mashups” webex + Free Chapter 1 Download

This week I had the pleasure of visiting JackBe headquarters in Chevy Chase, MD and conducting a joint webex with CTO John Crupi. For those who might not know, John is a well-established figure in the Patterns community, having co-authored Sun’s Core J2EE Patterns book. John had the foresight to register MashupPatterns.com more than a year ago and graciously donated the domain to me last year.

JackBe has two case studies in Mashup Patterns, but the purpose of this webex wasn’t to showcase these or their Presto Platform – it was an educational event designed to explain the evolution of Enterprise Mashups, present some of the patterns, and demo a few implementations JackBe had created.

What was more impressive than the record-breaking attendance we received was the quantity and quality of the questions raised during the talk. In fact, so many good issues were brought up that it’s taken us a few days to put all the answers together! The questions demonstrate that people are really starting to understand the promise of Mashups within the enterprise and are thinking about the right things.

As an additional bonus, JackBe secured permission to distribute the first chapter of Mashup Patterns for free. If you want to come up to speed on this new paradigm quickly (or know someone who does) this is a unique collection of great resources.

The video, question/answer archive, and Chapter 1 are available at:
www.jackbe.com/news_events/mashup_patterns.php
