2009 VDC Cyberinfrastructure Project: 2009

Wednesday, July 15, 2009

Oh yeah, trying to implement the RESTful client is proving to be difficult... The addressing.mar file is not being read. The problematic line is sender.engageModule(Constants.MODULE_ADDRESSING); commenting it out seemed to work, but there might be some unforeseen error that I'm not considering. Probably shouldn't try to figure it out this late!

Tuesday, July 14, 2009

July 14 Summary

After experiencing frustration with the wsdl2 data binding this morning, I decided to mess with the jena code to try and get the lowering process off the ground at least. I think I was able to fix the jena code - apparently I was using the wrong "QueryExecution" object. I ran some sample SPARQL queries over rdf data obtained from the nexml2cdao xsl and it seemed to work ok.

One particular issue I should probably raise with Rutger, Hilmar, and/or the TreeBase team is the base URI for the relative terms in the nexml2cdao output rdf. I'm not sure what this should be - for now I have a temporary base URI because jena will complain if it doesn't have a base (the rdf:IDs contain relative URIs). I'm actually pretty sure this isn't a big issue, as only the resulting xml matters...
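To make the base URI point concrete, here is a minimal sketch of how an rdf:ID value turns into an absolute URI. In RDF/XML, rdf:ID="x" is equivalent to the relative reference "#x" resolved against the document's base. The class name, the example.org base, and the ID "otu1" are placeholders chosen for illustration, not values from the project, and the sketch uses only java.net.URI rather than jena.

```java
import java.net.URI;

final class BaseUriSketch {

    // Resolve an rdf:ID value against a base URI by treating it as the
    // fragment-only relative reference "#id".
    static String resolveRdfId(String baseUri, String rdfId) {
        return URI.create(baseUri).resolve("#" + rdfId).toString();
    }

    public static void main(String[] args) {
        // Placeholder base URI (an assumption, standing in for the temporary base).
        String base = "http://example.org/nexml2cdao-output";
        // prints http://example.org/nexml2cdao-output#otu1
        System.out.println(resolveRdfId(base, "otu1"));
    }
}
```

Whatever base is chosen only changes the prefix of the resulting subject URIs, so a query that matches on the fragment or on property values would not be affected by swapping it later.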

Now I need to figure out the correct SPARQL query, which I don't think should be difficult for the simple Matrix service - all I need is to extract the Treebase ID from the rdf.

Commits:

The revised nexml2cdao xsl that now includes

Tomorrow, I absolutely have to make some progress with the data binding (I think I procrastinated a little today by focusing on the jena code)

Friday, July 10, 2009

Not much to report this week - I have been busy at ICWS (two talks and chairing a session). Looks like I'll have to put in double time next week...

I really think the functionality of the service needs to get done soon (at the latest in a week and a half), as I believe getting the lowering schema to work will be a challenge. The functionality is challenging as well, but doable in that time frame. The libraries can be accessed; it's just a matter of getting the schema on the wsdl2.0 interface to conform... I'll have to get started on this as soon as I get home.

Friday, July 3, 2009

A lonnnggggg week...

Just got through with the transform folding the nexml triples with the metadata triples...took longer than expected because of ONE line that I commented out while debugging. Go figure...

I started populating the svn with a structure - by no means is it final, but it's a start. I have a lot of dependency files and should probably create an ant or maven build manager to get them all together. I will attach README files in all appropriate directories. Dependency files will exist in the clients and services directories. I'm not going to populate it with my code until Sunday night when I leave for LA.

Regarding the client, I have a SAWSDL parser in place, extracting the appropriate sawsdl attributes for modelRefs and schemaMappings at different locations of the WSDL2.0 document. In the next few days, the client should be able to lift both element and input tags via the extracted xsl references, and I should have the start of the jena-driven rdf operations (for the sparql).
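As an illustration of the kind of extraction described here, the sketch below walks a WSDL 2.0 document with the JDK's DOM parser and collects the three SAWSDL attributes (modelReference, liftingSchemaMapping, loweringSchemaMapping) wherever they appear. It is not the SAWSDL4J or Woden4SAWSDL API; the class name, the sample document, and every example.org URI are made up for the example. Only the SAWSDL and WSDL 2.0 namespace URIs are the real ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class SawsdlAttributeSketch {

    static final String SAWSDL_NS = "http://www.w3.org/ns/sawsdl";
    static final String[] SAWSDL_ATTRS =
            {"modelReference", "liftingSchemaMapping", "loweringSchemaMapping"};

    // Hypothetical WSDL 2.0 fragment; names and example.org URIs are placeholders.
    static final String SAMPLE_WSDL =
        "<description xmlns='http://www.w3.org/ns/wsdl'"
      + " xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " xmlns:sawsdl='http://www.w3.org/ns/sawsdl'>"
      + "<types><xs:schema>"
      + "<xs:element name='matrixRequest'"
      + " sawsdl:modelReference='http://example.org/onto#Matrix'"
      + " sawsdl:liftingSchemaMapping='http://example.org/xslt/lift.xsl'/>"
      + "</xs:schema></types>"
      + "<interface name='MatrixInterface'"
      + " sawsdl:modelReference='http://example.org/onto#MatrixService'/>"
      + "</description>";

    // Returns one "elementName attribute=value" entry per SAWSDL attribute found,
    // in document order.
    static List<String> extract(String wsdl) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for the *NS lookups below
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wsdl)));
            List<String> found = new ArrayList<>();
            NodeList all = doc.getElementsByTagNameNS("*", "*");
            for (int i = 0; i < all.getLength(); i++) {
                Element e = (Element) all.item(i);
                for (String attr : SAWSDL_ATTRS) {
                    if (e.hasAttributeNS(SAWSDL_NS, attr)) {
                        found.add(e.getLocalName() + " " + attr + "="
                                + e.getAttributeNS(SAWSDL_NS, attr));
                    }
                }
            }
            return found;
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse WSDL", ex);
        }
    }

    public static void main(String[] args) {
        for (String line : extract(SAMPLE_WSDL)) {
            System.out.println(line);
        }
    }
}
```

Matching on the namespace URI rather than the "sawsdl:" prefix keeps the lookup working when a document binds the namespace to a different prefix.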

The service right now is just a POJO with bean classes that are meant to mimic the structure of Nexml. I tried previously to put it in the service, but got some errors. I think they are easily fixable (I can place a jar in the lib directory of the service .aar file), but I'll leave that for when I come back. There is a clear procedure I have had to follow when creating the service and its WSDL2.0 containing the sawsdl markup. These procedures will be discussed in readme files and hopefully automated by maven/ant.

Wednesday, June 24, 2009

6-24 summary

Well, looks like Woden4SAWSDL is just as spaghetti as SAWSDL4J and has significantly less support for it. I can only follow the test cases (most of which do not work - at least for me). I am supposed to meet with Doug Brewer (one of the architects) Friday so I hope he can shed some light on this for me.

In the meantime, I will just continue with building the RESTful WSDL doc...

6-24

Haven't been able to work on anything today as of yet (my advisor and I have been hammering out another paper due tomorrow).

Last night as I was going to bed, I was able to get a client working (barely). If I can get that done soon, then tonight and tomorrow, I think I will go back to the Woden4SAWSDL API, play around with it, and apply it to that example WSDL REST service tutorial. The goal is to have the client be able to access the ModelReference and SchemaMapping extensibility elements in the SAWSDL (WSDL2). We'll see...

6-23 Summary

What a great tutorial! I'm understanding this whole REST shoehorned into WSDL thing a lot better. I think I'll just use this tutorial as a template from here onwards. After reading, I'm optimistic I can make my half-point goal. Hopefully by the end of the week I'll have some code to post.

Tuesday, June 23, 2009

6-23

Found a good tutorial late last night...

http://wso2.org/library/3726

Will try to work on that today

Thursday, June 18, 2009

6-17 (and early 6-18) summary

Yes, it's really 4:10 AM! I have been trying to work with sawsdl4j ALL day and I keep coming up with new issues. I have to resolve this soon, as this is probably the most fundamentally crucial part of the project! I think I can get it with just a fresh set of eyes and a little advisement from my friend Doug, who worked on it. Let's see if I can get this done by the end of the week. Good night world zzzzzz....

Wednesday, June 17, 2009

6-17

Scheduled activities for the day

Note:

Can't start anything today until 5 - have a few meetings today with research advisor and committee.

5 PM - 7 PM - more WS stuff

7 PM - 9 PM - errands

9 PM - 12 AM - get started on SPARQL (at least for the data - metadata I may need to work through more examples)

Tuesday, June 16, 2009

6-16 Summary

Well, didn't post anything this morning as promised, but I got a late start to the day, and had a few meetings so I couldn't.

Today was more about the WSs than NeXML. I am still investigating the woden converter tool to convert WSDL1.1 to WSDL2.0 docs. This will be key for creating RESTful services down the road.

I am going to check in some code soon - not much, but at least to get something on. I'm envisioning the following directory structure:

root (trunk)
    README.txt // explanation of the project?
    docs // documents explaining this structure - may be better to have distributed help documents
        tutorials
            DeployWS // step-by-step tutorial on the procedure for building and deploying WSs
            TransProcedure // procedure outlining the process of creating lifting and lowering schema mappings
            Clients // procedure outlining the creation of clients
        javadocs // if needed
        misc // others as required
    environment // the required environment - may include dependencies, batch script, etc., and certainly an explanation
    tests // I plan on deploying/hosting WSs myself for some time - that information will be explained here
    xslt // just like the NeXML directory - will contain the liftingSchema (nexml2cdao) and various loweringSchema transforms that might be needed for specific use cases
    spql // required sparql queries
    xsd // NeXML (and perhaps other) schemas required
    sawsdl // these are basically sample services - functionalities will vary; hopefully I can get a lot of different use cases using TreeBase, CIPRES, etc.; the input will be NeXML in various forms
        README.txt // instructions on how to deploy, what dependencies are needed, etc.
        WSDL1.1 // SOAP services created with WSDL1.1 interfaces; includes the .war files
            echo // basic echo service
            matrix2tree // the first service recommended to me by Rutger - takes a character state matrix and converts it to a tree; could be a wrapper service around CIPRES: http://www.phylo.org/rest/rest_api.php
            treebaseUseCase // need to get more information
            misc // other services may go here - maybe I can just go down the list on: https://www.nescent.org/wg/evoinfo/index.php?title=PhyloWS
        WSDL2.0 // SOAP services created with WSDL2.0 interfaces
            echo // basic echo service
            matrix2tree // the first service recommended to me by Rutger - takes a character state matrix and converts it to a tree; could be a wrapper service around CIPRES: http://www.phylo.org/rest/rest_api.php
            treebaseUseCase // need to get more information
            misc // other services may go here - maybe I can just go down the list on: https://www.nescent.org/wg/evoinfo/index.php?title=PhyloWS
        REST // REST services created with WSDL2.0 - probably future work
        Composite // compositions created with BPEL (a good indicator of the power of SAWSDL)
    clients // corresponding client files for each of the use cases - will try to include both a command-line and a web-based client
        WSDL1.1 // SOAP clients
            echo // basic echo service client
            matrix2tree // client for this service
            treebaseUseCase // client for this service
            misc // clients
        WSDL2.0 // SOAP clients
            echo // basic echo service client
            matrix2tree // client for this service
            treebaseUseCase // client for this service
            misc // clients
        REST // REST client
        Composite // composition client

Monday, June 15, 2009

6-15 summary

Well, once again I did more reading than coding. I think that's going to be a common theme this summer. NeXML still remains somewhat of a mystery to me, but I think if I continue reading every day, I will eventually get it.

That said, the transforms are still separate. I am still having trouble figuring out one example from the java api in particular - the AnnotationTest file. This file outputs annotations marked up by the meta element tag - kind of confusing to me as I had thought the dict element tag will do this. Anyway, this leads to a NeXML that does not validate! I guess I can leave it be for now, but for some reason that still bugs me. I don't think the different versioning is a problem - there may be one construct that I question in the 2.0 but for the most part they can be compatibly combined. Tests on all the examples are a must - you never know if the combination will introduce some other bug. I should include that in the docs.

The basic Web service was easy to build, but the interface is still in question. I'm going to go with a simple scenario for now and study the TreeBase docs (and the spreadsheet distributed the other day) to derive a scenario and then run it by the mentors.

Still frustrated that I haven't gotten much coding done. I know this is a necessary evil, but 3 weeks in I feel like I have to have something to show for the time. I guess the goal should be to get some code by the end of the week - regardless of scenario and/or liftingTransforms, I would like to start on the SPARQL on Thursday.

6/15

Activities scheduled for today:

~9 AM - Update blog, post to vdc listserv, check previous entries on NeXML and TreeBase listservs.

9 AM - 11 AM - More time devoted to the two lifting transforms (how to fold both into one)

11 AM - 2 PM - Get the WS deployed (at least with 1.1 and try with 2.0)

2 PM - 6 PM - Other errands

6 PM - 10 PM - Related readings

Thursday, June 11, 2009

Rolling Now...

I have finally begun to get on a bit of a roll. No new code to report, more testing than anything else, but I have really gotten comfortable with the NeXML java apis and the various schemas (which have direct correlations as outlined in Rutger's presentation). I have also continued reading all related materials (the wiki pages, listservs, and online forums form an INFINITE maze of information).

I spent some time today testing/observing the behavior of the stylesheets nexml2cdao and RDFa2RDFXML. For the most part, it is pretty straightforward - at least for testing the documents produced from the various test files. nexml2cdao does have that validation error with the ID - it appears to be a simple unacceptable character error (rdf:ID cannot have the character '$' in it). I think I still need another day or so of testing and inspection to understand the real pattern. The other issue is the versioning issue - we want to combine the two in some understandable way to get our final product, the liftingSchemaMapping stylesheet. I'm hoping this can be done by the end of the weekend, because I really want to start "phase 1" of the loweringSchemaMapping (i.e. SPARQL query against the triples).
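The '$' failure follows from the rule that an rdf:ID value must be an XML NCName, and '$' is not a legal NCName character. A rough sketch of a check and a possible cleanup is below. It is an ASCII-only approximation (the real NCName production also allows many non-ASCII letters), the class and method names are invented for the example, and replacing characters with '_' is just one option, not what nexml2cdao does.

```java
final class RdfIdSketch {

    // ASCII-only approximation of the NCName rule: a letter or underscore first,
    // then letters, digits, '.', '-' or '_'.
    static boolean isAsciiNcName(String s) {
        return s.matches("[A-Za-z_][A-Za-z0-9._-]*");
    }

    // Replace every character outside the allowed ASCII set with '_', and
    // prefix '_' when the first character may not start a name.
    static String sanitize(String s) {
        String cleaned = s.replaceAll("[^A-Za-z0-9._-]", "_");
        if (cleaned.isEmpty()) {
            return "_";
        }
        char first = cleaned.charAt(0);
        boolean validStart = (first >= 'A' && first <= 'Z')
                || (first >= 'a' && first <= 'z')
                || first == '_';
        return validStart ? cleaned : "_" + cleaned;
    }

    public static void main(String[] args) {
        System.out.println(isAsciiNcName("tree$1")); // prints false
        System.out.println(sanitize("tree$1"));      // prints tree_1
    }
}
```

One caveat with this kind of cleanup: two different source IDs (say "a$b" and "a_b") collapse to the same value, so a real fix would need to check for collisions.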

My last blog entry was somewhat rambling, so (from next Monday forward) I'm going to design an organized way of detailing my day-to-day activities. The daily log will closely mimic the weekly log. Each day will have (at least) two entries:

Beginning of the day - The outline of the day's planned activities and/or goals. I will try to be as specific as possible (regarding times and activities)
End of the day (or the point in the day where I will end time devoted to VDCSoC). Here, I will list successes, setbacks, and a summary of the day (perhaps with a flavor of where I believe I stand in terms of progress)
(Optional) Any sudden discoveries, unforeseen events, outlandish ideas, crazy happenstances, etc

Tomorrow (and the remainder of the weekend), I want to continue on this roll. I feel like I still have some catching up to do. I do have to interrupt it at some point though, as my advisor wants a prospectus revision done soon...

Tuesday, June 9, 2009

Journal Beginning

Well, I haven't posted on here in a while, but I think I want to start posting every day. It may help to keep a log on the progression of the project - daily summaries, updates, obstacles, progressions, celebrations, etc...

This past week I attended the VDC/DataOne meeting. I was pleasantly surprised -- it went as well as I could have possibly hoped. I learned quite a bit in my short time there. Sure I didn't know much of what was discussed, but I did get a better idea of the context of my project and am beginning to understand the scope of the grander goals of the VDC/DataOne project. And of course, I met some very smart people who gave some great feedback/recommendations for the future of the project. I will definitely get back to those recommendations sometime soon.

This week has started off a bit slow. After finally checking out the nexml source files from sourceforge, I have been spending much of my time (that is, time devoted to this project) getting to understand it. While the first week, I focused on the java parsers for NeXML (which I think still requires a little more time), I have spent this week investigating the schema and transforms in detail.

This past Friday, I was told to investigate two transforms, nexml2cdao.xsl and rdfa2rdfxml.xsl. These are the files needed for the lifting transformation. The problem with these files is that one was created using version 1 and the other version 2 - ideally, we would like both to be used seamlessly without compatibility issues. I believe resolving this issue this week is crucial, as I want the WS to have, at the very least, the liftingSchema transform done.
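For reference, applying a stylesheet from Java takes only a few lines with the JDK's javax.xml.transform API. The sketch below uses a toy stylesheet and toy input as stand-ins; neither resembles nexml2cdao.xsl or real NeXML, and the class name is made up. One thing worth keeping in mind if the version difference is about XSLT versions: the XSLT processor bundled with the JDK implements XSLT 1.0 only, so a 2.0 stylesheet would need a different processor such as Saxon on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class LiftingTransformSketch {

    // Toy XSLT 1.0 stylesheet (a stand-in): emits each otu id followed by ';'.
    static final String TOY_XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='//otu'>"
      + "<xsl:value-of select='@id'/><xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet and apply it to the input, returning the result as a string.
    static String transform(String xsl, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        String input = "<otus><otu id='t1'/><otu id='t2'/></otus>";
        System.out.println(transform(TOY_XSL, input)); // prints t1;t2;
    }
}
```

Chaining two stylesheets can be done the same way by feeding the string returned from the first call into a second call.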

Unfortunately, it wasn't as simple as I thought it was going to be - I was having issues with Oxygen, as the examples given in the svn were not validating with the schemas given. Clearly, this issue had to be resolved (embarrassingly enough, it turned out to be a PEBKAC error in my Oxygen configuration :). Thus, I decided to look in excruciating detail at the xsd files. I'm actually glad that I did, as it will probably help in the future. I actually read this in conjunction with my molecular evolution book and everything seemed to make a little more sense to me.

So just now, I have performed the transforms on the smaller, more trivial examples - but the larger xml files were taking too long, or crashing my machine (I'm running eclipse on my laptop with limited memory). So tomorrow, I will have to move the testing to my larger machine in my office. I hope I can make up some time on Thursday and Friday (tomorrow I have a couple of important meetings) and play catch up with the original goals I stated in the weekly update. One more goal to add - I want to have a more detailed calendar of future events/expectations.

Well, these really are just a lot of random thoughts thrown together, but I guess that is what a (b)log is all about - I will try to be more concise and understandable with each daily entry. Good night world!

Monday, May 25, 2009

Wow has it been a while...

Nothing like a rant for a first post to a blog! :) I guess it's more of a random thought - I'll probably have many of these throughout the project...

Ahh, the irony of research - the more time spent researching a specific topic, the further behind you fall when it comes to knowing current technologies. After doing some necessary reviewing of the recent advances of the PhyloWS group for this project, I have found I have much to learn. For years, I have been studying Web service compositions (specifically the adaptation problem of WSCs in dynamic environments) - a very relevant, useful, and (dare I say) neat idea. The problem is that the support system still uses old (but still working) standards. I have been so heavily dependent on the BPEL standard for WSCs, which uses WSDL 1.1 exclusively, that I have either overlooked or dismissed the new (well, not really new) RESTful style Web service (or WSDL 2.0 for that matter). Sure, I have been curious and have read several papers on the topic, but I still used what I had been used to in my empirical experiments in research. I have grown fat and lazy on this style almost to the point of being stubborn!

I still like the WSDL 1.1 paradigm and think it still has a place, but I know that many do not and I must adapt accordingly, perhaps starting with this project. Luckily, SAWSDL works for both 1.1 and 2.0, so this can be an easy transition. I am familiar with SAWSDL4J, but I also hear Woden4SAWSDL may prove useful as well. This will give me a chance to code using the tools to which I am accustomed in 1.1, while gradually moving in the WSDL 2.0 direction (but there is still no ...). I have been told it is much easier too - we'll see!

That said, we need to stick to the SOAP paradigm for now, as adding semantics to RESTful web services is still an ongoing issue...

Sunday, May 17, 2009

Project Beginning

A summer of evolution begins...
