- Mike Bennett
- John Nowlin
- Elisa Kendall
- Anthony B. Coates
- Jefferson Braswell
- Maxwell Gillmore
- Pete Rivett
1) Use Case reminder
2) Where we are on our road map.
3) Open Action Items
4) JIRA Issues Review - https://jira.edmcouncil.org/projects/DER/issues/DER-10?filter=allopenissues
5) Today's content discussion.
6) For next week.
20181113 FIBO DER FCT
Repeat of the Tree Shaker presentation. We only got halfway last week. DA also showed this to Brian Jacob of Data dot World. That discussion gave some ideas that would entail a rebuild but do not change the slides. There is a slide deck. The notes below cover only additional points and Q&A on it.
DA gives background: FIBO is intimidating and imports lots of things one either does not want or is not aware of the relevance of. Tree shaker starts with the parts of FIBO you are interested in and finds the relevant pieces. SM is the OMG Specification Metadata ontology. The OMG uses the short abbreviation sm whereas FIBO uses the long 4-part abbreviations. Similarly SKOS etc. use short namespace abbreviations. Why rel-rel imports Agents: it includes terms such as designates and appoints, which reference Agents. Meanwhile the terms we needed for FinancialDates did not need those.
The import of plc-loc (Locations) is based on an OWL import that no longer represents a dependency. So that import can be removed right away. Meanwhile we split rel-rel into 2 pieces: one with the 2 properties we needed for dt-fd and the rest of rel-rel. The 2 we need do not themselves depend on other properties. Similarly Arrangements can be split. Could also split up AV but there is no benefit in doing that. But the algorithm does that anyway. See slide 'Break things up' for the result. In this case, the extracted versions of rel and arr don't require any further imports. So you no longer see the imports of Agents and the like. So this works out well for FinancialDates. This is where we got to last week.
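As an illustration of the difference described above, here is a minimal sketch that contrasts following whole-ontology imports with following only the terms actually referenced. It is not the Tree Shaker code itself. The term names (fd:Date, rel:propA, rel:propB, agt:Agent) and the TERMS table are made up for the example and are not real FIBO content.

```python
from collections import deque

# Toy data (hypothetical, not real FIBO): term -> (ontology it lives in, terms it references)
TERMS = {
    "fd:Date":        ("dt-fd",   ["rel:propA", "rel:propB"]),
    "rel:propA":      ("rel-rel", []),
    "rel:propB":      ("rel-rel", []),
    "rel:designates": ("rel-rel", ["agt:Agent"]),
    "rel:appoints":   ("rel-rel", ["agt:Agent"]),
    "agt:Agent":      ("agents",  []),
}

def import_closure(seed_ontologies):
    """Whole-ontology view: if any term in an ontology references another ontology, import all of it."""
    ontologies = set(seed_ontologies)
    changed = True
    while changed:
        changed = False
        for term, (ontology, refs) in TERMS.items():
            if ontology not in ontologies:
                continue
            for ref in refs:
                ref_ontology = TERMS[ref][0]
                if ref_ontology not in ontologies:
                    ontologies.add(ref_ontology)
                    changed = True
    return ontologies

def shake(seed_terms):
    """Term-level view: keep only the terms reachable from the seeds by reference."""
    keep, queue = set(seed_terms), deque(seed_terms)
    while queue:
        term = queue.popleft()
        for ref in TERMS[term][1]:
            if ref not in keep:
                keep.add(ref)
                queue.append(ref)
    return keep

def ontologies_of(terms):
    return {TERMS[t][0] for t in terms}

kept = shake(["fd:Date"])
print(sorted(import_closure({"dt-fd"})))  # includes "agents"
print(sorted(ontologies_of(kept)))        # does not include "agents"
```

In this toy case the extracted part of rel-rel holds only the two properties that dt-fd uses, so the Agents ontology drops out of the result.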
JN: Going through FIBO, it looks good but if we could have a view of Swaps and Futures without all the other stuff (mortgages etc.), would like to show just that. Navigating through all FIBO is hard. JN: Also want to be able to see e.g. all Swaps or all Futures, versus specific kinds. JN: At CFTC, you want different views. As above. e.g. specific like just Commodities Swaps, or combined views e.g. Commodities Swaps and IR Swaps, etc. JN: Navigating FIBO is all about the views.
DA: This segues to the next example. Continues with slide deck.
Slide 7: 'Let's Try it for Swaps'. Can't do a diagram for that as it would be too large. Not only the ones listed but the pool of other ontologies that these themselves import. DA: But if you look into these, there are surprising references. See list on slide.
DA: Running the code now, there is a defect. It does not clean this list up as well as expected. DA: Is this list surprising to JN? Do some of these listed things actually apply to Swaps? JN: All the things listed are very important. When I look at FIBO, the navigation to find just Swaps is hard; I end up with a lot of facts I do not need to know. JN: The things listed on this slide do constitute an excellent subset of Swaps.
DA is surprised, since some of these turned out not to be in the list of dependencies for Swaps as modeled. EK: A lot of those things are relevant to Derivatives. Maybe not IntroducingBrokers. JN: All of these are relevant. EK: Most of these were in her spreadsheet and in her mapping. She suggests to DA that one of the points of reference should be the mapping she did for CFTC. DA could compare that to the actual dependencies in the model. If some of these are expected and do not show up from the Tree Shaker, this shows an issue with the model. Inheritance is also covered in the TS. DA: So this is another use of the Tree Shaker, to identify things not linked in the model that ought to be.
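The comparison suggested here amounts to a set difference. A minimal sketch, assuming the mapping and the Tree Shaker output are both available as plain lists of concept names; the names below are placeholders, not the contents of the actual CFTC mapping.

```python
def model_gaps(expected_from_mapping, reached_by_tree_shaker):
    """Concepts the experts expect in the view that the dependency walk never reached."""
    return sorted(set(expected_from_mapping) - set(reached_by_tree_shaker))

# Hypothetical inputs for illustration only
expected = ["Swap", "Underlier", "Counterparty", "ClearingHouse"]
reached = ["Swap", "Underlier", "Counterparty", "Date"]

print(model_gaps(expected, reached))  # candidates for missing links in the model
```

Anything this returns is either a gap in the model or an expectation to revisit with the domain expert.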
DA: A lot of the things in Slide 7 do get factored out by the Tree Shaker. Jeff Braswell: "Swaps" is a higher level class than "Interest rate swap". JN: From a simplistic view, the different views of FIBO that would help the CFTC, such as the Swap view, the Futures view, the Swaps and Futures view, would be useful.
DA: We need to know which things we do expect in the view, e.g. in the case of Slide 7, the things that (all) Swaps bring in. Need JN intuition of whether there are any things in this view that are not expected. Underlier applies to Swaps since most Swaps have one, including IR Swap but not e.g. CDS (MB says it actually does).
TC: What content do we expect to not be removed. Given that Underlier is not a good example of that. PR: What was your start point here? DA: The ontology called Swaps.
JN: From CFTC perspective, we have a certain view we want to see. If we want to get down to the Swaps version we have to navigate down 4 or 5 levels to get there. JN: The aim is not about FIBO but about changing the views. DA is not talking of changing FIBO but about identifying which things belong in a given view by filtering.
PR: By starting from Swaps you are starting at quite a general ontology. More detailed types, e.g. IR Swaps would be more specific. Maybe by starting with Swaps you are missing things related to more detailed things. Wrong start point?
DA: Using Swaps would also filter out e.g. mortgages. Jeff Braswell: Aren't all dependencies under a given node implicitly in a 'complete' view? Perhaps we are talking about a constrained view, or a filtered view? DA: You do not miss out the details of the more detailed ones.
MB: That is wrong; you would not get the more specific ones since the imports don't go that way. So PR is right. MG: The problem is as much one of 'what do you want to drill down on'. At one level of consideration, all these things would be relevant to a swap. In another context, e.g. financial implications of a swap, there may be things you would not be interested in. Otherwise all the things listed can be relevant to swaps. DA confirms that Slide 7 is not all the things picked up in the import tree but a selection of ones he thought were irrelevant. Jeff Braswell: Equivalent to a SPARQL query with qualifiers and filters, no?
JN confirms MG's point. If we move this into a triple store you have different views for different analysts. The hope is that FIBO need not produce one entire triple store with all the data but can support multiple triple stores for different contexts, including the kinds of combinations he suggested earlier. That is, the ability to organize triple stores for a specific context. JN wants to have triple stores that don't contain data that is not necessary. Potentially develop triple stores on the fly for static reports on a given subject.
Jeff Braswell: Can you really segregate triple stores like that? JN this is preferable to a triple store created from all of FIBO. Jeff Braswell: You now have the problem of multiple/redundant copies of the same data, which becomes a data integrity problem. DA: for Tree Shaker, identify the things that something actually refers to. But per MG point, for some contextual things, don't want all the things, but another view would. So the Tree Shaker is probably not the right solution to this problem. Hearing a requirement for something else, that the Tree Shaker does not do.
TC: Sounds like you might want to be able to specify a starting point as a single class and say 'find everything connected to this but exclude these properties and classes', thereby pruning to something manageable. TC: The alternative would be number of hops but that is probably not good enough.
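A minimal sketch of what TC describes: a walk from a single starting class that skips excluded properties and classes, with an optional hop limit for comparison. The graph, class names and property names are hypothetical, and prune_view is an illustrative helper, not an existing tool.

```python
from collections import deque

# Toy graph (hypothetical names): class -> [(property, target class)]
GRAPH = {
    "Swap": [("hasCounterparty", "Counterparty"), ("hasUnderlier", "Underlier")],
    "Counterparty": [("hasIdentifier", "LEI"), ("hasName", "Name")],
    "LEI": [("registeredBy", "RegistrationAuthority")],
    "Underlier": [],
    "Name": [],
    "RegistrationAuthority": [],
}

def prune_view(start, exclude_props=(), exclude_classes=(), max_hops=None):
    """Breadth-first walk from start, skipping excluded edges and nodes."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if max_hops is not None and hops >= max_hops:
            continue  # hop budget used up on this branch
        for prop, target in GRAPH.get(node, []):
            if prop in exclude_props or target in exclude_classes or target in seen:
                continue
            seen.add(target)
            queue.append((target, hops + 1))
    return seen

print(sorted(prune_view("Swap", exclude_classes={"LEI"})))
print(sorted(prune_view("Swap", max_hops=1)))
```

Excluding LEI keeps the counterparty's name but drops the identifier branch and everything behind it. The hop limit cuts both branches at the same depth regardless of relevance, which is one reason it may not be good enough.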
PR: As in the Data Dictionary, but rather than a Boolean we want something like shapes, to pull up what is relevant to a given usage context. DA: At present the DD starts at a given class and uses particular (Boolean) filters. These are clearly not doing the job. So, using shapes means we are in a position of needing a lot of import for given views / contexts. That is, not simply here is all IR Swaps, but here is a shape that defines my view for a given context. Not clear how we would develop that.
MG: The Shaker concept is isolating relevant topics. Can go beyond that. Start with IR Swaps and a minimum of things that are relevant. From this starting point, then need to filter for relevance, e.g. what is interesting from a given POV. Then when I get to Counterparties, I don't need all about that, don't need all the LEI ontology, I just want a name. In another context, e.g. Credit, I may want more details about those counterparties. MG: So start with the Tree Shaker to get the relevant concepts and then drill down on that to expand on this or that concept. What if I can click on the next concept and run the TS again on that? So I narrow the scope.
JN: MG is spot on. For CFTC, we have people looking at e.g. oil futures. They will want to look at commodity swaps and commodity futures, download that into a triple store to do their analysis. At present what you get is too much for them to identify what they need for their analysis. Need to provide a view of e.g. Commodity Swaps and Commodity Futures, create triple store from that. JN: Also even deeper, e.g. for Commodity Swaps I also want Energy Swaps and Energy Futures as well. So downstream things (not OWL imports) are also needed.
DW: Where do we go from here?
DA: From what I have learned, the question is what belongs in the view and what does not. The automatable part is to look at the things that are referenced. Context is important. Will not get that from automation. MB: We could do this if we had 'mediating thing' contexts in the ontology itself, using SPARQL. However this does not include the kind of traversal-based specification of an individual user's contextual usage, as described by MG.
DW: There are 2 opposed viewpoints: people who want to see what FIBO has on a particular instrument type to compare with what they think, versus extracting all that FIBO has.
JN describes a use case: CFTC looking at swaps in Oil futures. So we have swaps in Oil futures and we have futures/Oil futures. Want to import that into a triple store and look at that. Then we want to look at something else, e.g. all energy futures. For this we would download all the energy futures and all energy swaps, create a triple store on the fly. Don't want to download all the SDRs or all the exchanges, as this will overwhelm us.
JB: One issue with these operational subsets is that, to the extent that data is shared and referenced, you will have redundancies and copies of data that will confuse in terms of data currency. May want to do more ad hoc: extract but not create persistent data stores.
MG: there are 2 levels of interest: the tree shaker pulls out your metadata. Then what JN describes is filtering specific instances. So, I might be interested in a company's exposure to EUR and so extract all txns that are relevant to EUR. That's just a filter / query (SPARQL, like SQL, 'where currency = Euro'). So there are places where JN wants to filter that way, and there are ways where JN wants to filter on the metadata. So this is 2 things.
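The two levels MG distinguishes can be sketched as two separate filters: one on the kinds of thing in the view (metadata), one on the values of individual records (instances). The records, class names and helper functions below are invented for illustration. In a triple store the second filter would be a SPARQL query rather than Python.

```python
# Hypothetical instance data described with reference to ontology classes
TRANSACTIONS = [
    {"id": "t1", "type": "InterestRateSwap", "currency": "EUR"},
    {"id": "t2", "type": "CommoditySwap", "currency": "USD"},
    {"id": "t3", "type": "Mortgage", "currency": "EUR"},
]

# Level 1: the metadata view, i.e. which classes belong to the subset of interest
SWAPS_VIEW = {"InterestRateSwap", "CommoditySwap"}

def in_view(rows, classes):
    """Keep only instances whose class is part of the metadata view."""
    return [row for row in rows if row["type"] in classes]

def where(rows, **conditions):
    """Level 2: filter instances by value, like a WHERE clause."""
    return [row for row in rows if all(row.get(k) == v for k, v in conditions.items())]

euro_swaps = where(in_view(TRANSACTIONS, SWAPS_VIEW), currency="EUR")
print([row["id"] for row in euro_swaps])
```

The EUR mortgage is dropped by the metadata view and the USD swap by the value filter, so the two filters do different jobs.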
TC: Also JN seems to want to filter not FIBO but data described with reference to FIBO. MG agrees, but he is also looking to isolate to a subset first. JN: Yes to both. When we take FIBO and put it in a triple store there are a lot of things we don't care about. Rather than create a triple store with info we don't need, get it into a format that we can use. JN: For example Swaps without everything else, Futures without everything else. Leave out the ancillary stuff we don't need.
Jeff Braswell: wouldn't you (structurally) need all things referenced by a given node, even if you are extracting datasets ? MB seems like 2 things: what the Tree Shaker does PLUS including ontologies that contain classes and properties that are subClassOf or subPropertyOf the ones at the start point. Jeff Braswell: when/where would you impose "off-page/off-graph" connectors or references to more static data, for example? Jeff Braswell: if one did not include all referenced sub-classes
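A minimal sketch of the "two things" MB describes: first widen the start point downwards to its subclasses, then follow references (including superclass links) from all of those. The class names and both tables are hypothetical, and sub-properties are left out to keep the example short.

```python
# Toy model (hypothetical names)
SUBCLASS_OF = {
    "InterestRateSwap": "Swap",
    "CommoditySwap": "Swap",
    "EnergySwap": "CommoditySwap",
}
REFERS_TO = {
    "Swap": ["Underlier"],
    "InterestRateSwap": ["RateIndex"],
    "CommoditySwap": ["Commodity"],
    "Mortgage": ["RealEstate"],
}

def descendants(cls):
    """All direct and indirect subclasses of cls (the direction imports do not follow)."""
    found, stack = set(), [cls]
    while stack:
        parent = stack.pop()
        for child, superclass in SUBCLASS_OF.items():
            if superclass == parent and child not in found:
                found.add(child)
                stack.append(child)
    return found

def references(seeds):
    """Everything reachable from the seeds via references and superclass links."""
    keep, stack = set(seeds), list(seeds)
    while stack:
        term = stack.pop()
        targets = list(REFERS_TO.get(term, []))
        if term in SUBCLASS_OF:
            targets.append(SUBCLASS_OF[term])
        for target in targets:
            if target not in keep:
                keep.add(target)
                stack.append(target)
    return keep

def view(start):
    return references({start} | descendants(start))

print(sorted(references({"Swap"})))  # dependencies only
print(sorted(view("Swap")))          # dependencies plus the more specific kinds
```

Starting from Swap alone reaches only what Swap itself refers to. Adding the descendants first brings in the more specific swaps and what they need, while mortgages stay out.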
DA summarizes: the issue is knowing what is needed as a start point, with the Tree Shaker plus something more sophisticated. So we end the meeting with agreement on the question but not the answers.
DA we have not looked at the output of the Tree Shaker in today's discussion. Additional stuff is the shapes to be specified for different context users. But start with the dependencies. As we have done with the Data Dictionary. But the DDs only go a couple of levels deep, which is not very satisfying. How do we find the sub set of things JN is interested in.
JN: What would be neat is if we said 'Here is FIBO for just Swaps'. This is important because we can create a triple store for that. JN: Also be able to combine e.g. FIBO-for-Swaps plus FIBO-for-Futures. Jeff Braswell: if you are extracting a dataset in the form of, say, a CSV for import into another database, you don't need all of the ontology class references, but if you are loading into a smaller triple store, you would
JN: We are just starting off. If you give us an ontology for just Swaps we can get up and running. When you give us all of FIBO it is too much for us to figure out for ourselves what we need for Swaps, what we need for Futures etc.
Jeff Braswell: there is a distinction between selecting a subset of metadata (graph structures) versus a subset of data instances
TC: We need to go into this more. Not the same as creating a relational database from a schema. Need to talk more to isolate what is good or bad here.
JN: difference between selecting a sub set of the graph and selecting a sub set of the data. If you will import into another ontology you need all the references to not break the ontology, without needing the data. Jeff Braswell: a data vector, not a sub-graph
EK: Another possibility is closer to what we did with the mapping: here are the mappings that represent a minimal set of concepts needed to produce a given report. Then use the Tree Shaker to winnow out those things that are not in those mappings. DA: then we can take what MG was saying about context-specific views; these take a lot to specify, but in the mapping, you are already doing a lot of that work, e.g. you start from what is needed for a given report. So already specified what parts of FIBO you need. But you can't include only that, since you need to know what parts of FIBO you need. That uses the Tree Shaker.
Jeff Braswell: intersection of desired elements with graph dependencies. EK: Can use the mapping to feed the Tree Shaker. EK working to update the mappings, which were getting out of date. DA: Wells Fargo project has a more systematic way of specifying these based on an entire workflow. EK: Yes, we should be using that in FIBO from now on.
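One way to read "use the mapping to feed the Tree Shaker" is that the mapped concepts become the seeds, and the dependency walk then adds whatever those concepts need in order to stand on their own. A minimal sketch under that assumption; the report fields, concept names and reference table are invented and do not come from the CFTC or Wells Fargo mappings.

```python
# Hypothetical mapping: report field -> ontology concept
MAPPING = {
    "report.notional": "NotionalAmount",
    "report.counterparty": "Counterparty",
}

# Hypothetical references between concepts
REFERS_TO = {
    "NotionalAmount": ["MonetaryAmount"],
    "MonetaryAmount": ["Currency"],
    "Counterparty": ["LegalEntity"],
    "Mortgage": ["RealEstate"],
}

def shake_from_mapping(mapping, refers_to):
    """Seed the walk with the mapped concepts and keep everything they depend on."""
    seeds = set(mapping.values())
    keep, stack = set(seeds), list(seeds)
    while stack:
        concept = stack.pop()
        for target in refers_to.get(concept, []):
            if target not in keep:
                keep.add(target)
                stack.append(target)
    return keep

print(sorted(shake_from_mapping(MAPPING, REFERS_TO)))
```

The mapping supplies the context-specific choice of what is wanted, and the walk supplies the parts that cannot be left out without breaking references.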
ACTION: EK and DA: Show the WF mapping process if that is allowed.
DWiz: At this point Dean's hotspot failed as he was coming to some conclusions. Clearly this topic is not over for us.
ACTION: DWiz Put the Tree Shaker on the FPT agenda.