Trust and Internet Identity Meeting Europe
2013 - 2020: Workshops and Unconference

SAML Federation

(Marco Leonardi)

Marco: I am from the European Space Agency and I submitted this topic for discussion. Looking to the future, it starts from our mandate of supporting SAML federations: we really want to consider eduGAIN as an important source of users, because we provide services concerning _____ data, and eduGAIN will for sure provide scientists in the community interested in accessing this data. What we will put in place is infrastructure, and we are thinking about the AARC architecture and interoperable services. You can access an entry point and reach something, but these services may need to interoperate and provide a chain of services. In this context, what we need is to propagate as much user information as possible between the services. There is a single entry point, and the user may not be known to the rest of the chain. We also need to make the services able to do things for the user, like systematic processing.

A black box with an entry point for the user that could allow the user to be authenticated towards different organisations: an entry point that could communicate with IdPs, following the AARC architecture, that could process and enrich the user identity, and after that the user could have access to the service. In a standard and simple architecture, this is fine. But what if S1 relies on S2, which relies on S3, which has to access data in a database? The problem we have in our use case is that everyone in the chain wants to know things about the user: if you just want to access data then OK, but if you want to access Sentinel data then we need to know who you are. Even before the question of how to receive this information, the issue is how to put this in place.

David: It's a very well-known problem. This is delegation, as well as time shifting and offline operation backspacing (?).

Niels: S2 and S3, are they services? Are they SAML SPs or OIDC SPs or shell accounts? Let's make it one step easier: the other scenario, if your last arrow is a relation between servers, is to have a portal living on S3 that collects tokens that will give you access, like OAuth, SSH, a certificate. This is what we identified as part of the AARC project.
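As an illustration of the portal-collects-tokens idea, here is a minimal sketch of how a downstream service such as S3 might validate an OAuth 2.0 bearer token via token introspection (RFC 7662). The introspection endpoint URL and the client credentials are hypothetical placeholders, not anything agreed in the session:

```python
# Minimal sketch: a downstream service (e.g. S3) checking an OAuth 2.0 bearer
# token via RFC 7662 token introspection before acting on a request.
import requests

INTROSPECTION_ENDPOINT = "https://proxy.example.org/oauth/introspect"  # hypothetical
CLIENT_ID = "s3-service"        # hypothetical client registered at the authorization server
CLIENT_SECRET = "change-me"     # hypothetical

def introspect(token: str) -> dict:
    """Ask the authorization server whether the presented token is still valid."""
    resp = requests.post(
        INTROSPECTION_ENDPOINT,
        data={"token": token},
        auth=(CLIENT_ID, CLIENT_SECRET),  # the service authenticates itself to the AS
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

def authorize_request(bearer_token: str) -> str:
    info = introspect(bearer_token)
    if not info.get("active"):
        raise PermissionError("token is expired or revoked")
    # 'sub' and 'scope' are standard RFC 7662 response members
    return f"access granted to {info.get('sub')} with scope {info.get('scope')}"
```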

Mischa: Are all the services in one trust domain?

Marco: In the basic scenario, yes.

Mischa: So it’s easy to pass information around?

David: It's so simple that you might end up with a golden proxy. You probably don't want that.

Marco: The complete scenario is to cross the boundaries. From a certain point of view, if you are in the same domain you can find some way of passing information; you don't need to invent tokens or anything, you can also rely on trust to pass information to the others. Let's think about the basic case where the user accesses with SAML; then the business-to-business communication is something that is transparent to the user. If this is managed inside the same security domain, you don't have to take care of anything. The point is what happens if you have to connect S1 to an SX that is inside another infrastructure.

We don't have things clear, because we know that from the point of view of the rules to follow to make this happen it's a mess, but there are a lot of people working on it. I have an extra complexity here, as these infrastructures are not purely research infrastructures; they are commercial. We as ESA now promote and support this kind of thing by financing it, but we cannot operate it: they are managed by commercial entities.

Niels: From a legal perspective, you guys are the owners of the system.

Marco: We remain the owners of the data.

Niels: But you are the owner of the actual data, a role that is pretty well described in the GDPR. You must make sure it's NASA's problem (?). If you are hiring Amazon to do some compute on your data, that would be personal data, and it doesn't matter what is in it: you still have the ownership and legal responsibility.

Marco: This is not the point here. It's that we have something that analyses data that we want to provide to the users, but in which way? In the past we did it on our own and it was extremely costly. What do we want to do with this data? Not to earn money, but to provide it to the community. I don't want to build infrastructure. This is the data, do our c…

Niels: You are basically asking: going to an FTP server provider, you are telling them "here is a pile of data, distribute this on our behalf", making the data available.

David: Would a reseller participate in a programme, and would they have to participate in your control? Who decides who gets access to the data? Is it you (ESA)?

Marco: For sure, ESA.

Niels: I think you are the owners of this problem and you can't outsource it.

Marco: In theory, I am expecting to provide requirements. Yes, let's think about very sensitive data: we don't want to release this data to some people, so you do not have to provide this data to these kinds of people. I am the one that releases the requirement.

I don't want to provide requirements without some indication, because I want to stick to standards.

Niels: And also, you want to at least make it visible that you did your best.

Marco: What I want is that if someone joins my infrastructure tomorrow, I don't want to spend money for them to implement things.

Niels: How do you establish who gets access?

Marco: The process should remain the same as now: scientists apply for data and provide information; for special data we require more, such as projects or an institution that can certify them, but this is done by hand, because the scientist doesn't come with a federated account. It's not done yet. We are working in the background, trying to simplify things by doing this kind of thing. It's not simple because of the legacy systems. This is why the user has to register. It's even more complex.

If you are coming in that way, it would be like REMS. The services would have to learn this at application time. In your scenario, it would be better to use external IdPs; technically speaking it doesn't matter. Use that proxy to integrate with all your services, with REMS for the cloud thing on your picture. One of the sources could be a REMS-like system: you quickly go to the REMS system and get access as one.

Marco: We will have problems in this cloud because it will be important to not provide access to some people. Nationality is not provided by any IDP.

Niels: ESA has national centres throughout Europe? It wouldn't be completely unfeasible: if you are a Dutch scientist you would have to go through an ESA process. Can we assume that in the local domain the problem is non-existent?

David: if they have ultimate trust in each other, yes

Niels: You don't need trust in the services but in the proxy. That would be sufficient. We would have to do that anyway, because it's issued by the proxy.

Mischa: Let's assume that you get an access token. If S2 gets it from S1, how does it know that it's a valid access token?

Niels: Either the model is that you have multiple web interfaces and you can use regular SAML, or you have a model with a back-channel binding; then it's a non-web scenario. The only web scenario is the entry point. That was then created by S2.

Mischa: No, it's created by S1. Could you pass it on?

Niels: The only way this could work is if they ultimately trusted each other.
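One way to make the "pass it on" step explicit, rather than S1 forwarding its token as-is, would be OAuth 2.0 Token Exchange (RFC 8693). A minimal sketch, assuming the proxy also acts as the authorization server; the endpoint, client credentials and audience below are hypothetical:

```python
# Hedged sketch of OAuth 2.0 Token Exchange (RFC 8693): S1 trades the user's
# token for a new, audience-restricted token it can present to S2, instead of
# simply passing the original token down the chain.
import requests

TOKEN_ENDPOINT = "https://proxy.example.org/oauth/token"  # hypothetical
S1_CLIENT_ID = "s1-service"       # hypothetical
S1_CLIENT_SECRET = "change-me"    # hypothetical

def exchange_for_s2(user_access_token: str) -> str:
    """Exchange the user's token for one scoped to S2."""
    resp = requests.post(
        TOKEN_ENDPOINT,
        data={
            "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
            "subject_token": user_access_token,
            "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
            "audience": "https://s2.example.org",  # hypothetical identifier for S2
        },
        auth=(S1_CLIENT_ID, S1_CLIENT_SECRET),  # S1 authenticates as a client
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]
```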

Marco: Given that these services expose a web interface for the user, they can then interact with each other on the backend.

David: If they trust each other you could do service-to-service relations. That is also trivial: without tokens, just passing information. Mutual confirmation could work.
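A minimal sketch of the tokenless, "mutual confirmation" variant David mentions, assuming mutually authenticated TLS between the services inside one trust domain; the URL and certificate paths are placeholders:

```python
# Hedged sketch: S1 calls S2's backend over mutually authenticated TLS and
# passes the user information as plain request parameters. Both services trust
# a common internal CA; no user-bound tokens are involved.
import requests

def call_s2_as_s1(user_id: str, dataset: str) -> dict:
    resp = requests.get(
        "https://s2.internal.example.org/process",          # hypothetical backend endpoint
        params={"on_behalf_of": user_id, "dataset": dataset},
        cert=("/etc/s1/tls/s1.crt", "/etc/s1/tls/s1.key"),   # S1's client certificate
        verify="/etc/s1/tls/internal-ca.pem",                # CA both services trust
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```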

Niels: What is the value add of having S1 talking to the backend of S3?

Marco: The point could be to obtain a certain product that is offered by S1. The basic product would need to be processed in many other ways, and this processing can happen in other services. There is already a kind of orchestration in place: workflow coordination, an engine, etc., so that starting from the base data things happen across the services and the final product is provided to the user.

Niels: So why is this being handled on behalf of the user?

Mischa: The central thing doesn't know about S2 and S3; the primary connection is to S1 and then it goes to S2 and S3.

Marco: What if a new service is included? There is a standard interface between S1 and S2, so that S1 knows that this data is available through S2. You can decide to use it or not. It's more complex to oversee, given that these services are somewhat independent.

Niels: How does the user of S1 express "I want to use the thing in S2"?

Marco: I am not sure.

Niels: How can S1 dynamically decide that the thing is going to be used? Someone must have instructed S1.

David: I could imagine you order S1 to use the highest resolution for a picture.

Gerben: If you model the infrastructure centrally, why not have all the catalogues there? Why can't you figure it out then?

Mischa: You would have to change your entire architecture.

Niels: It seems like you have this model where you don't know what S1 is doing.

Mischa: You need ways of authenticating this kind of service: you don't know what it's doing, but you know that it's doing something.

Niels: You have given it the credentials, so apparently, as I suspected, you cannot just say "S1, do this and fetch it"; for S1 to go and fetch, you need some credentials as well.

David: It depends on the IGD velocity (?) and it didn't have ultimate trust. The user gave the delegation.

Matthew: Let's make it concrete: you have a SAML authentication response coming in, and then somehow, magically, it is getting translated into Kerberos, or into a database login, or an SSH public key. What you are really talking about there is a proxy solution. You are talking about authoring a new proxy (SATOSA or SimpleSAMLphp) that will do the translation for you; you can do whatever you want. The delegation means that S1, S2 and S3 are acting like proxies.

David: Not necessarily.

Niels: For any definition of proxy you are correct, but it's not like this here. A proxy there would be: I am the S1 certificate, and because there is a job on S1, I get access to S2.

Matthew: So this turns into something that is very protocol specific.

Marco: What I want to summarize is that, as I said, from the technical point of view we can't impose anything. For me any solution is fine, as long as this solution doesn't mean, once again, creating something that is not standard…

Matthew: My take: I would write something like a plugin for SATOSA that would take the incoming response from my IdP and issue an event on my Salt Stack event bus; that is, have a bus that handles the provisioning of tokens. But you will end up writing something custom.
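A hedged sketch of the kind of plugin Matthew describes: a SATOSA response micro-service that emits a provisioning event after the IdP response has been processed. The config key, event payload, and the HTTP endpoint standing in for the Salt event bus are assumptions, not anything demonstrated in the session:

```python
# Hedged sketch of a custom SATOSA response micro-service. After the upstream
# IdP response is processed, it publishes an event so an external bus (stand-in
# for the Salt Stack event bus) can provision tokens/accounts for the user.
import requests
from satosa.micro_services.base import ResponseMicroService


class ProvisioningEventMicroService(ResponseMicroService):
    def __init__(self, event_bus_url, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # 'event_bus_url' would come from the micro-service YAML config (hypothetical key)
        self.event_bus_url = event_bus_url

    def process(self, context, data):
        # data.attributes holds the attributes released by the upstream IdP
        event = {
            "tag": "satosa/authn/success",   # hypothetical event tag
            "subject": data.subject_id,
            "attributes": data.attributes,
        }
        # Stand-in for "issue an event on my Salt Stack event bus": POST to an
        # HTTP endpoint that fronts the bus.
        requests.post(self.event_bus_url, json=event, timeout=5)
        return super().process(context, data)
```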

Niels: I would go to the SPs and tell them not to do this. But again, you're telling us you want open standards and to do this in a standardized way, and yet the entities on the receiving end aren't standardized.

David: You have a choice of two technologies for doing the services, like OAuth.

Niels: On the bright side, it does help you engage, but the delegation is not solved.

Matthew: I would love to see an actual working ECP example.

The Flipchart - https://imgur.com/SCfgsJX