Trust and Internet Identity Meeting Europe
2013 - 2020: Workshops and Unconference

FIM impact assessment using research community SP

(Peter Gietz)

Does the service provider have the data that interests us, and is it willing to give us that data? Which kind of data interests us? Quantitative data might be easier to obtain, while qualitative data can be complicated. What kind of parameters would be interesting, and how could we motivate providers to give them freely? A service provider federation, a group of service providers that choose which identity providers they need, is a good model, and federation operators could do similar things. It always comes down to policy, and they could include it in the policy.

What were the reasons for blacklisting ADP?

If the identity provider wants to have a contract before providing information, it is a sensitive situation. Additional contracts require additional work. Federations could maybe try to educate participants better. It is always better to have n-to-1 than n-to-n contracts. The entity category is a sort of implicit contract. What data should we collect? We should be careful not to collect personal data. If the login comes from the social network it is long federated, and if the login is done directly at the application locally, that is the worst case and is not federated.

We should distinguish four types of login:

  • the one we want to have: a really federated login,
  • login via an IDP of last resort,
  • login via a social network,
  • login directly at the RSP site.
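The four login types could be told apart with a small classifier along the following lines. This is only a sketch: the entity IDs, the two lookup sets and the function name are hypothetical, and a real SP or proxy would take these lists from its own configuration or federation metadata.

```python
from enum import Enum


class LoginType(Enum):
    FEDERATED = "federated IDP"
    LAST_RESORT = "federated IDP of last resort"
    SOCIAL = "social IDP"
    LOCAL = "RSP site login"


# Hypothetical example values, not real endpoints.
LAST_RESORT_IDPS = {"https://last-resort.example.org/idp"}
SOCIAL_IDPS = {"https://social.example.com/idp"}


def classify_login(idp_entity_id):
    """Map the IDP that authenticated the user to one of the four types.

    A missing IDP is taken to mean the user logged in locally at the RSP site.
    """
    if not idp_entity_id:
        return LoginType.LOCAL
    if idp_entity_id in SOCIAL_IDPS:
        return LoginType.SOCIAL
    if idp_entity_id in LAST_RESORT_IDPS:
        return LoginType.LAST_RESORT
    # Anything else is assumed to be a regular federation IDP.
    return LoginType.FEDERATED
```

Treating every unknown IDP as a regular federation IDP is an assumption made for brevity; a stricter version would check against the federation metadata.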

We could use email addresses as a good indicator of where the data comes from, but a lot of researchers who have switched universities just use their Gmail account. How are we going to collect all these things: are we talking about new plugins, new coding, or humans going through logs? We should do it with a script that goes through the logs; we need something that could do it in real time.

Collecting the raw data is always better, but it contains IP addresses.
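One way to keep raw log lines without the IP addresses is to scrub them before storage. The sketch below assumes whitespace-separated log fields in which an IP address appears as a token of its own; the function name and the placeholder text are made up for illustration.

```python
import ipaddress

PLACEHOLDER = "[ip-removed]"  # hypothetical marker, any constant would do


def scrub_ips(log_line):
    """Replace every token that parses as an IPv4 or IPv6 address."""
    cleaned = []
    for token in log_line.split():
        try:
            ipaddress.ip_address(token)
        except ValueError:
            cleaned.append(token)  # not an IP address, keep as-is
        else:
            cleaned.append(PLACEHOLDER)
    return " ".join(cleaned)
```

Addresses embedded in a larger token (for example `client=192.0.2.10`) are not caught by this simple approach and would need format-specific parsing.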
If someone logs in via a university IDP but uses a Gmail account, the record would show Gmail, which is confusing.

For real federated logins, some data could be collected on the IDP side. You want to know whether one person logs in all the time or whether these are different identities.

For the real federation case, this could be done using the session ID that the IDP has. We may want to think about how to make the federation work reliably without discriminating. The point is to expose the problems in the education part of the system.

Central points such as a proxy are a good place to do the counting. We should just start and see how it goes. What should be recorded?

For the raw data, we want to have a user serial number, login domain and email contact from the user.
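A raw record of this kind could look like the sketch below. It assumes that only the domain part of the email is kept, as in the table in these notes, so that no personal address is stored; the class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import count

_serial = count(1)  # simple running serial number, one per recorded login


@dataclass(frozen=True)
class RawLogin:
    serial_id: int
    login_domain: str   # domain of the IDP or site the login came through
    email_domain: str   # domain part of the user's email only


def make_raw_login(login_domain, email):
    """Build a raw record, discarding the local part of the email address."""
    email_domain = email.rsplit("@", 1)[-1].lower() if "@" in email else ""
    return RawLogin(next(_serial), login_domain.lower(), email_domain)
```

Keeping the login domain and the email domain as separate fields means a university login with a Gmail contact address stays recognisable as a university login.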

Table with:

  • total number of successful logins
    • federated IDP
    • federated IDP of last resort
    • social IDP
    • RSP site login
  • total number of unsuccessful logins
    • different reasons, such as
      • no unique ID
      • no R&S
      • policy missing (Sirtfi)
      • SAML response error
      • network error
  • from raw data (data for every login)
    • serial ID
    • login domain
    • domain part of email
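The counts in this table could be produced from per-login records roughly as follows. This is a sketch under assumptions: each record is a dict with the hypothetical keys `success`, `login_type` and `failure_reason`, which are not defined anywhere in the notes.

```python
from collections import Counter


def summarise(records):
    """Count successful logins by type and unsuccessful logins by reason."""
    successes = Counter()
    failures = Counter()
    for rec in records:
        if rec["success"]:
            successes[rec["login_type"]] += 1
        else:
            failures[rec.get("failure_reason", "unknown")] += 1
    return {
        "successful_total": sum(successes.values()),
        "successful_by_type": dict(successes),
        "unsuccessful_total": sum(failures.values()),
        "unsuccessful_by_reason": dict(failures),
    }
```

For example, two federated logins, one social login and one failure with reason "no R&S" would give a successful total of 3 and an unsuccessful total of 1.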

We will be able to show the number of successful and failed logins. We collect data all the time, but only take snapshots every 30 days.
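Assigning each login to its 30-day snapshot could be done as sketched here. The fixed collection start date and the function name are assumptions; the notes only state the 30-day interval.

```python
from datetime import datetime, timedelta

SNAPSHOT_LENGTH = timedelta(days=30)


def snapshot_window(timestamp, start):
    """Return (window_start, window_end) of the 30-day snapshot containing timestamp.

    `start` is the assumed moment data collection began.
    """
    index = (timestamp - start) // SNAPSHOT_LENGTH  # whole windows elapsed
    window_start = start + index * SNAPSHOT_LENGTH
    return window_start, window_start + SNAPSHOT_LENGTH
```

Logins that share a window start belong to the same snapshot, so the per-snapshot totals can be computed by grouping on that value.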