Archive for March, 2009

Mining the Deep Web with Mashups

I have an introductory article on how to mine the Deep Web (and Deep Intranet!) with mashups over at InformIT. The article is a good introduction for the layperson who may be unaware of how mashups can be used in this manner.

I have to point out one small mistake: the article mentions that the Presto platform (from JackBe) is able to implement the API Enabler pattern through their partnership with Dapper. In fact, their in-house developed EMML (Enterprise Mashup Markup Language) also makes this possible (without needing any Daps).

Twinsoft’s Convertigo platform (not mentioned in the article) is also capable of implementing API Enabler. I’m sure there are other tools I am missing that can create a SOAP or REST API against a presentation layer (either by screen-scraping or the more elegant technique of parsing a page’s DOM). New tools seem to be popping up almost daily! The take-away is that mashups can do more than combine disparate sites. They can also extract data “at the glass” when a public interface isn’t available (or doesn’t expose the specific information you’re after).
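To make the “at the glass” idea concrete, here is a minimal sketch in Python of what an API Enabler does under the hood. It is not how Presto, Dapper, or Convertigo work internally; it simply parses a page’s markup and hands back structured records. The `TableScraper` class, the `page_to_records` function, and the sample page are all hypothetical names invented for this illustration, and the sketch assumes the data of interest sits in a simple HTML table whose first row holds the column names.

```python
import json
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, row by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def page_to_records(html):
    """Use the first table row as field names and the rest as records."""
    scraper = TableScraper()
    scraper.feed(html)
    scraper.close()
    if not scraper.rows:
        return []
    header, *body = scraper.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical page fragment standing in for a site with no public API.
SAMPLE_PAGE = """
<table>
  <tr><th>part</th><th>stock</th></tr>
  <tr><td>widget</td><td>12</td></tr>
  <tr><td>gadget</td><td>0</td></tr>
</table>
"""

if __name__ == "__main__":
    # A real API Enabler would serve this JSON over HTTP; here we just print it.
    print(json.dumps(page_to_records(SAMPLE_PAGE), indent=2))
```

A production tool would also have to fetch the page, cope with malformed markup, and survive layout changes, which is exactly the work the commercial platforms take off your hands.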

“An Introduction to Enterprise Mashups” webex + Free Chapter 1 Download

This week I had the pleasure of visiting JackBe headquarters in Chevy Chase, MD and conducting a joint webex with CTO John Crupi. For those who might not know, John is a well-established figure in the Patterns community, having co-authored Sun’s Core J2EE Patterns book. John had the foresight to register MashupPatterns.com more than a year ago and graciously donated the domain to me last year.

JackBe has two case studies in Mashup Patterns, but the purpose of this webex wasn’t to showcase these or their Presto Platform – it was an educational event designed to explain the evolution of Enterprise Mashups, present some of the patterns, and demo a few implementations JackBe had created.

What was more impressive than the record-breaking attendance we received was the quantity and quality of the questions raised during the talk. In fact, so many good issues were brought up that it’s taken us a few days to put all the answers together! The questions demonstrate that people are really starting to understand the promise of Mashups within the enterprise and are thinking about the right things.

As a bonus, JackBe secured permission to distribute the first chapter of Mashup Patterns for free. If you want to come up to speed on this new paradigm quickly (or know someone who does), this is a unique collection of great resources.

The video, question/answer archive, and Chapter 1 are available at:
www.jackbe.com/news_events/mashup_patterns.php

A re-cap of last week’s CeBIT panel on Enterprise Mashups

Last week, I attended the CeBIT conference in Hannover, Germany as part of California’s partnership with what is one of the world’s largest technology conferences. CeBIT partners are typically countries (recent ones include France and Russia); California is the first state to be individually selected – not a surprise given its global contributions to innovation.

California’s participation brought a broader focus than the traditional “trade show” and “digital marketplace” attitude typically prevalent at CeBIT. Whereas normally the show is about the latest “physical” advances in things like consumer electronics, server hardware, security and access devices, etc., this year there was a focus on the “application development” side of technology. There was a large area devoted to open-source, and a number of panels devoted to general education on topics such as Virtual Worlds, Digital Content Distribution, and the panel I moderated titled “Business Solutions with Enterprise Mashups”.

We had an interesting group. John Crupi from JackBe and Rene Bonvanie from Serena are from firms whose tools are among the most common ones associated with enterprise mashup environments. Olivier Poupeney’s firm DreamFace supplies a UI framework that has been called a “souped up” iGoogle. This framework also stands behind another mashup tool, Twinsoft’s Convertigo. Stefan Liesche (from IBM’s Portal team) was able to speak to IBM’s work in the space. And Ludmila Radzevich-Vorobiova from Apatar (provider of an open-source data integration tool) was on hand to represent data mashups.

I had intended to provide a full transcript, but technical issues with the audio quality unfortunately prevented this. However, there are a few important issues that I think the panel addressed that will be of interest to readers. I’m listing the questions below with some of the responses from my notes:

Can you explain the differences between consumer and enterprise mashups?
This was an introductory question handled by Rene and intended to draw the audience into the topic. What was interesting was that Rene almost immediately introduced the idea of governance, something that I’ve written about recently. I don’t question Serena’s commitment in this area, but their humorous ads and viral internet videos have historically stressed user-empowerment over this concern. That particular viewpoint could make some firms nervous about empowering their employees. Rene’s remarks about the role of mashups within the enterprise – and how they need to be managed – were welcome.

What is the state of mashups as a recognized discipline within enterprise IT departments?
John Crupi offered that mashup implementations are going to require the involvement of IT to succeed. So the idea of unsupervised end-user mashup creation is already a bit antiquated in his opinion, too. John is a good authority on IT best practices because he also co-authored the book Core J2EE Patterns while a Distinguished Engineer at Sun. According to John, mashups are also the “last mile” in successfully leveraging the SOA environments that many firms have spent significant time and expense building out over the last few years.

With Stefan Liesche from IBM’s portal team sitting close by, I asked Olivier from DreamFace:
With the emergence of mashups in the enterprise, is the traditional corporate portal no longer relevant?
Olivier’s feeling is that portals will continue to exist as a mechanism to provide common information to a broad community, but custom user-created solutions that remix existing data into highly personal solutions will become more common. It’s the long tail applied to portals. A corporate portal may provide access to the 20% of information that affects all users (reports, company meetings, etc.) but tech-savvy business people will create their own information channels.

The rest of the panel (Stefan included) noted that many mashup tools produce JSR-168 compatible portlets and can be an easy way to integrate existing systems into existing portal infrastructures.

Although we had established the thread of a nice conversation regarding the state of mashups and their impact on traditional IT, I couldn’t resist the urge to change the pace somewhat with Stefan. I had to know: is the mashup space now mature enough to attract IBM, or does IBM need mashups given the breadth of its offerings?

Stefan explained that “mashups” are a logical extension of the connectivity IBM’s customers desire, and that it’s natural for them to integrate new products (like IBM Mashup Center) with the other areas of the IBM platform. IBM has many facets that explore a wide range of technology, but I interpreted this as, “Yes – mashups have reached a point where we are committed to making them part of our stack”. IBM may have been “playing around” with mashups with the late, lamented QEDWiki, but with IBM Mashup Center they are getting serious attention within Big Blue.

Is the value of mashups as supporting data to standalone applications (as part of a larger solution), or is non-visual data integration valuable enough in itself?
This was the question I posed to Ludmila from Apatar. The obvious answer is, “Both”, but Ludmila provided some insight that reinforced what the rest of the panel had already begun to focus on. IT can use data mashups to complete their own data integration tasks, but they are also a valuable mechanism for giving teams the raw materials they need to create their own solutions (via the tools of the other panelists, for example).

The general discussion seemed to indicate vendors are coming to a mutual understanding of how mashups will integrate into corporate environments. This was not the picture a year ago when some camps were predicting the demise of IT while others merely promised more rapid development cycles. I think this is partly due to the economic conditions that have evolved over the last 6 months. “Risk” is an unpopular concern right now, and completely re-writing the IT engagement model is fraught with potential hazards. A closer partnership between developers and end-users seems the easiest way to introduce mashup technology without rocking the boat. But with decreased budgets, mashups still face an uphill battle. No matter how promising the technology, many firms will not spend in this area unless clear benefits can be shown. John Crupi explained this best when I asked, “On a scale of 1 to 10, with 10 being the most important, what is the value of enterprise mashups?”

John’s answer is a prudent way to judge the value of any emerging technology. “Mashups for the sake of mashups?” he said, “Two. Mashups with a clear business case and ROI (return on investment) attached? Nine.”

Mashups Re-write the Rules of Software Development

I could have called my last post “The Virtuous Circle of Blog Posts”. It’s kicked off a lot of interesting discussion, particularly over at the CIO Weblog (here and here). You’ll see that I responded to Scott’s initial post down at the bottom.

I’m not opposed to “innovation without permission” (as Kevin Parker at Serena Software once described mashups to me). In fact, some of my most successful projects began as “skunkworks” efforts that I worked on without a clear mandate. However, these projects eventually “graduated” from the proof-of-concept phase into legitimate applications once people came to depend on them.

Perhaps in my guest post over at JackBe, I should have been clearer regarding the expectations and requirements of an enterprise versus those of an individual. The code I write for myself at home is in a completely different universe than what I write at the office. I can take risks, cut corners, and tolerate bugs and a messy interface because there is one user: me. And the consequences of these poor practices affect one person alone: me. At the office, however, even if I am the only user, my actions take place in a much larger context – that of my employer. The repercussions of a problem can cascade out to have far-reaching effects, like ripples from one small stone thrown into a pond.

Does this mean IT must remain firmly in control of mashup-powered efforts? I don’t think it could even if it wanted to. The perfect storm of ubiquitous information, open APIs, powerful tools, and a new generation of technology-savvy users is radically changing the solutions development landscape. As I wrote in my reply to Scott:

  [The advance of mashups] is similar to the rise of “do-it-yourself” home centers. I am sure that a large number of the people I see in Home Depot go home and build complete garbage. But – it serves their needs. Maybe the bookcase looks horrible and will fall over when someone bumps into it – but it solves their immediate problem and they feel a sense of empowerment having built it themselves. BUT – at some level there is a minimum level of quality that is assured by the “provider”. The lumber isn’t rotten, the screws don’t bend, and the paint is lead-free. This is how I see enterprise IT’s new role in the emerging mashup landscape – delivering the working parts that help mashups work.

Quality control can also take place after the fact. What if, after you built that bookcase, someone from Home Depot came over to your house and checked your work? Sort of like the walkthrough a new building gets before the owner will be given a Certificate of Occupancy. A professional’s careful analysis can help ensure that you don’t accidentally wind up getting hurt. This is another role IT may play in the emerging mashup environment. Let people create ad-hoc solutions, but give these products a quick health-check before OK’ing them for general use.

One thing is certain: mashups represent a fundamental shift in programming practices the likes of which have not been seen since development shifted from mainframes and dumb terminals to local desktops. I think most firms haven’t yet begun to appreciate the magnitude of this transformation. Neither shoe-horning mashups into existing methodologies nor throwing out all the rules is a solution; both are extremes. We need to recognize we are in a period of fundamental transition and nurture the growth of this new discipline by supporting both the freedoms that make experimentation possible and the controls that limit liability and risk.
