MDG plan for a tracking package

Make it an opt-in (NOT opt-out) package that you have to meteor add separately and everything is fine. Some, like us, just don’t want to send you any metrics in the first place. MDG ought to tread very carefully here, this is ground where a lot of companies have been burnt.

Looking at the code, do not add it to 1.3 as it is. Others already mentioned the problems with storing IP addresses, and all enterprise people will balk at another thing to whitelist.

Thinking about it pragmatically, if it’s an opt-in (or opt-out, either way - opt-out is just more annoying!) package, all serious devs will leave it out, so I feel your metrics will be limited to the countless test apps people build locally before shedding autopublish, insecure and other extra fluff that the people who do things the “right way” from the beginning never use in the first place. Hence I think this one needs to go back to the drawing board.

A few alternate ideas on gathering feedback for your use:

  1. Offer free Galaxy hosting time for doing developer questionnaires. Especially with free meteor.com hosting going down, this would be a no-brainer for you to get serious statistics and information on a large scale, and get developers to try out Galaxy even if it would be on a 512M instance for a month or two.

  2. Hire developer community outreach people who are proactive in contacting people who do anything serious with Meteor from all around the world. Build your contact list internally and treat it (and the people who are in contact with the developers) as gold. Conduct quarterly research questionnaires and such on these few hundred key people. Have some of these people be in an invite-only IRC/Slack/Telegram/Whatever chat where you discuss new features first with the people who are using your product the most in places that you do not have presence in before presenting a more public draft. In this case, a draft is not an intrusive PR to core.

  3. If the forum and GitHub aren’t sufficient, instead of building an internal tool, ask the community to help you build something that works for both.

I very much applaud that you consider the “silent majority”, but as they are not active you cannot make any sound assumptions about their intentions, wants or needs - you are only dressing your own thoughts as those of a group of people. The vocal ones should be the ones driving the development - it’s not really that hard to register on a forum or GitHub. I doubt all the metrics you’d be getting from the census package as it is now would be much more efficient without major time investment on analytics on your part.

At this moment, I think of all the times I had to tell people that they really should remove the Autopublish package and no, it’s not “b-b-but it’s a core package built in by default so it is meant to be there or something will break (and it does break when I remove this package)”, even if your messages about the need to remove this package are very clear.

@zoltan

First off, I want to say I think we put the cart before the horse here by writing the code and opening the PR before engaging the community first. We’re an engineering driven company and sometimes code is easier to write than prose. We won’t make that mistake again.

Thank you dearly for recognizing this. I think this has been becoming a “theme” lately and usually is the main grounds for the disappointment and occasional turmoil.

@zoltan

If 80% of apps remain in the shadows we haven’t really achieved anything.

I don’t agree with this. Statistics is an interesting science. You’ll be amazed to see what you can achieve with the data you’ll get from 0.0001% of the apps.
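
To put a rough number on that, here is a minimal sketch of the textbook margin-of-error formula for a proportion. It assumes a simple random sample, the normal approximation and 95% confidence; the function name `marginOfError` is my own, not anything from the census package.

```javascript
// Margin of error for an estimated proportion from a simple random sample.
// Assumptions: normal approximation, 95% confidence (z = 1.96).
function marginOfError(p, n, z = 1.96) {
  if (n <= 0) throw new RangeError('sample size must be positive');
  if (p < 0 || p > 1) throw new RangeError('p must be between 0 and 1');
  return z * Math.sqrt((p * (1 - p)) / n);
}

// Worst case is p = 0.5; 1,000 sampled apps already give roughly +/- 3 points.
console.log(marginOfError(0.5, 1000).toFixed(3)); // 0.031
```

The catch is the random-sample assumption: apps that opt in are self-selected, so the real uncertainty can be larger than this formula suggests.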

@tmeasday

I kind of doubt that people would be OK with the world knowing the maxSessions of their app as a general rule. Then again, perhaps those people wouldn’t be OK with MDG knowing it either?

This may be debatable in either case, but I am sure I can get some of my clients to share their data both for you to see and publish publicly, while for others, I’m more than sure that even hinting at the possibility of such data disclosure outside their company network would get me crucified. So that’s why I wholeheartedly agree with:

@juho

Make it an opt-in (NOT opt-out) package that you have to meteor add separately and everything is fine.

and let you know that I will opt-in for some of my apps and that

@fvg

What about something like a .meteor/census-config.json file to set, which data are allowed to be sent to MDG?

is an even better suggestion because a blanket-rule about opting-in and out is just too broad. For people like me, who would like to actually help the platform grow and improve, it would be a nice way to contribute back to you and the community.

And @brajt’s comment about how scary it may sound for a lot of newcomers to remove a default package is spot on.

@rozzzly

Damn. Thats ridiculous. I’m all for privacy, but that protection is just silly. “oh no I only can see 3/4 of the ip… there’s only 256 other possibilities…” are you kidding me?

Come on! Laws are there for a reason, and you are more than welcome to break them as long as you are ready to face the consequences. But you would do better to just respect them for what they are: a general consensus by a community of people who have decided that it is important for them to lay down some rules about the issue. By your reasoning, murder should not be illegal because it is perfectly easy for anyone to grab a knife and stab someone at random.
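
For anyone following along, the truncation being argued about amounts to something like the sketch below. `maskLastOctet` is a hypothetical helper of mine, not code from the PR; it simply zeroes the last octet of a dotted IPv4 address, which is what leaves the "256 possibilities" mentioned in the quote.

```javascript
// Zero the last octet of a dotted-quad IPv4 address (hypothetical helper).
function maskLastOctet(ip) {
  const parts = ip.split('.');
  const valid = parts.length === 4 &&
    parts.every(part => /^\d{1,3}$/.test(part) && Number(part) <= 255);
  if (!valid) throw new TypeError(`not an IPv4 address: ${ip}`);
  parts[3] = '0'; // 256 candidate addresses remain behind the masked value
  return parts.join('.');
}

console.log(maskLastOctet('203.0.113.42')); // 203.0.113.0
```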

Ok, here’s a more optimistic comment (but I do stand by the one above):

This gets me thinking - and I know limited resources, priorities, free stuff are hard to maintain, yada yada - what if MDG offered a free app-analytics package, nothing too fancy, just some basic functionality, in exchange for the rights for MDG to collectively analyze that data for their own development plans?

Most of us do trust @arunoda with our data on Kadira, don’t we? Why not put the same trust in MDG?

And when open sourced (both or either one of the data or analytics code), the community could contribute to do all kinds of interesting stuff that could benefit everyone.

Cool trick with the expandables!

I know you viewed this from a technical side, but this is about handing over identifiable information to 3rd parties without proper consent. And - as, I guess, illustrated by you - about ignorance of the fact that a population about 1.5 times that of the US values privacy much, much higher than the average American does. Took a while for Google to learn that too.

Injecting a tracking package into an open source distribution that sends data to a business (not a foundation or NGO), is exactly the kind of move that makes it impossible for me to suggest meteor as technology while fencing off questions regarding the reliability of the vendor.

Meteor is a glorified build tool. Imagine if webpack sent tracking data to webpack Inc. You guys need to learn and practice old-fashioned market research. With only a 20% gap, the statistical confidence should be more than high enough to draw conclusions.

And then expecting to track enterprise usage? This is just the kind of example why companies fear that their engineers even experiment with this kind of technology and accidentally send data of some secret project over the wire. A great excuse to block the adoption of Meteor.

@zoltan Use statistics and market research to validate your server telemetry. Tracking by a web hosting company - on the client side - has no place in open source technology that requires trust and more trust from a broad range of stakeholders, some of whom are also looking for reasons not to allow the use of the Meteor open source project.

Understand this Enterprise Example:

Imagine I sit together with the CTO and his senior dev to evaluate some technology stack. I talk about Meteor, an open source project - so no risk of a dying vendor or copyright material. Then they ask me “How do they make money?”. I say “They try to build an infrastructure hosting company around this. But the software is and stays open source.”.
Then they want to see it. I go on my, or sometimes the CTO’s, machine and download Meteor, create a new project in front of their eyes and run it.

I know exactly what will happen next
A) No tracking message
The CTO plays around for the afternoon and while he reads into the docs, he figures out he has been tracked. In some cases this even violates his own foreign infrastructure policies because the code sends data out through the firewall.

B) A notice is shown that tracking can be removed with "remove census"
The CTO will ask immediately: “You said it’s independent. What else do they track and how do we remove it? Is this a “closed” open source project? We cannot have anything tracking our users/clients/project infrastructure without knowing exactly what is sent around.” The demo ends.

C) Some message for opt in tracking by MDG
CTO: “We cannot allow any tracking of our infrastructure. Please choose NO. Is that the only part we need to be worried about regarding data leaks? What is the role of this MDG really, and is it likely they will just turn around on technology aspects if they can inject their own tracking package into an open source project?”

Data is the currency of the 21st century. If you want to be taken seriously, value the data of others as if you were about to pickpocket them, and think about the trust relationship you will have with that someone/victim later.

This really decides whether Meteor is a true open source project with no immediate profit goals or a marketing tool of MDG. The latter destroys more value than you could ever create by analyzing all the data points of Meteor apps.

I may be wrong but I doubt that creating a great roadmap is a difficult problem that couldn’t be solved without a built-in tracking package. Meteor already has an amazing roadmap (Apollo, multi-database support, NPM support, etc.) and I think MDG is very aware of the state of the JS ecosystem and the strengths and weaknesses of the Meteor platform in this ecosystem. My intuition is that this package will be used to bring some traction numbers to the attention of MDG investors, which is all fine and good, but may explain why you would like to track the largest possible number of apps, and not only a statistically exploitable portion of them.

Anyway, I still think that asking the user for explicit consent is a very important requirement of this feature. Continuing on @rozzzly’s idea of “making a (y/n) opt-in when using the CLI generator”, I’d like to suggest an amendment. @zoltan underlined that putting the prompt on app creation creates too much friction (“pause, think and press a key everytime you create an app”), and that it also breaks existing scripts relying on meteor create.

Luckily it seems that MDG isn’t interested in tracking local projects that were never deployed anywhere (correct me if I’m wrong). So why not ask for the user’s explicit consent on the first meteor deploy or meteor build? The first time you deploy an application, you need to think about many things anyway (Where to host the server? Galaxy? Where to host the database?) and so having one extra question about “do I accept to report usage statistics to Meteor Development Group to help them?” isn’t interrupting the user flow. The prompt will include a URL to the complete documentation of the census package, and if the user answers yes the package will be added and the build will continue. The question will only be asked the first time you deploy or build, and then a notice will be added in .meteor/.finished-upgraders so that it isn’t asked again for this application.

I believe this proposal has a number of advantages:

  • it ensures that the user is aware of the tracking;
  • it encourages the user to make an educated decision. First time deployment of an application you developed is a particular experience, thus you are very likely to click on the URL that will be prompted and thus, take a look at the census package documentation;
  • it forces the user to give an explicit consent on a per-application basis;
  • it doesn’t break existing scripts relying on meteor create;
  • it doesn’t break existing scripts relying on meteor build or meteor deploy once the notice has been added in the internal file;
  • it doesn’t scare newbies while they are playing with toy projects—until they build or deploy.

What do you think?

To address one point, if you include an explicit question in the meteor build or meteor deploy command you will end up breaking a lot of people’s automated deploy scripts.

I personally vote for the package being made externally, but not being included, or even asked about, in the create process. I would like to see a notice at the end of the application create process along the lines of:

Thank you for creating a new Meteor project.  We do not collect any
usage statistics from your application, however, to help us with our
market research, and to help us make Meteor better, you can install
the census package with `meteor add census`, which will send basic
statistical information to our servers.

For full information on the data census sends, visit http://.....

I also feel that any information sent back should be anonymous. If MDG do end up publishing the stats publicly, then it should only be aggregated or analysed data, not specifics. They can keep the data sent back, but Meteor is used world-wide, and needs to respect data protection laws world-wide.

One of the goals is indeed to avoid breaking these deploy scripts. The implementation I have in mind is:

  • if the app version is below 1.3, don’t prompt;
  • if the app version is 1.3 and above and there is the 1.3.0-census-package notice in .meteor/.finished-upgraders, don’t prompt;
  • otherwise prompt.

With that it won’t break existing apps, and will only prompt once for 1.3+ apps.
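
The three rules above could be sketched roughly as follows. This is only an illustration, not code from the Meteor tool: the function names are hypothetical, the version is assumed to be a plain numeric string such as "1.3.2" (any "METEOR@" prefix already stripped), and the only name taken from the proposal is the 1.3.0-census-package notice.

```javascript
// Notice name taken from the proposal above; everything else is hypothetical.
const CENSUS_NOTICE = '1.3.0-census-package';

// Compare "major.minor[.patch]" strings numerically; missing parts count as 0.
// Assumption: plain numeric versions only (no "-beta" style suffixes).
function isAtLeast(version, minimum) {
  const a = version.split('.').map(Number);
  const b = minimum.split('.').map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const diff = (a[i] || 0) - (b[i] || 0);
    if (diff !== 0) return diff > 0;
  }
  return true;
}

// finishedUpgraders: contents of .meteor/.finished-upgraders as a string.
function shouldPromptForCensus(appVersion, finishedUpgraders) {
  if (!isAtLeast(appVersion, '1.3')) return false;      // pre-1.3 app: never prompt
  const notices = finishedUpgraders.split('\n').map(line => line.trim());
  return !notices.includes(CENSUS_NOTICE);              // prompt only if not asked before
}

console.log(shouldPromptForCensus('1.2.1', ''));                       // false
console.log(shouldPromptForCensus('1.3', ''));                         // true
console.log(shouldPromptForCensus('1.3', '1.3.0-census-package\n'));   // false
```

After a yes or a no, the tool would append the notice to the file, so an automated script sees at most one prompt per application.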

Yeah, then what about a .meteor/census-config.json config file just like @fvg suggested? census may in fact be added by default with everything turned off via config, and should a dev want to opt-in, can turn the ones they deem appropriate on by setting them to true:

{
  "ipAddress": false,
  "sessionCount": false,
  "meteorVersion": false,
  "installedPackages": false,
  "socialSecurityNumber": false,
  "creditCard": false,
  "creditCardPinNumber": false,
  "yourBiggestMostSecretSin": false
}
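
To show how such a file could be honoured, here is a minimal sketch of an allow-list filter. `filterPayload` is a hypothetical name, and nothing here reflects how the census package is actually written: the idea is only that a field leaves the machine when the config sets it to exactly true, and anything missing or false is dropped.

```javascript
// Keep only the fields a developer has explicitly set to true.
// Anything missing from the config, or not strictly true, is dropped.
function filterPayload(payload, config) {
  const allowed = {};
  for (const key of Object.keys(payload)) {
    if (config[key] === true) allowed[key] = payload[key];
  }
  return allowed;
}

// Example config as it might be read from .meteor/census-config.json
const config = JSON.parse('{"ipAddress": false, "meteorVersion": true}');
const payload = { ipAddress: '203.0.113.42', meteorVersion: '1.3', sessionCount: 12 };

console.log(filterPayload(payload, config)); // { meteorVersion: '1.3' }
```

Defaulting to "drop" means a new metric added in a later census version stays off until the developer turns it on.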

I don’t feel it needs to be prompted at all, let alone in a build or deploy manner.

I think that with a prominent message when you create an application, and again in the documentation describing it as a package you can add and what it does, there will be enough traction from developers who wish to make those statistics available.

That said, if we follow the implementation you have in mind, I would like a way for opting out of it ever prompting me, as a particular developer etc.

Huh?!? (Sorry, off-topic, but) how do you do a collapsible content area?

I would also be fine with a no-prompt opt-in solution a-la Kadira. What I proposed was a compromise solution assuming that MDG is interested in converting as many apps as possible.

As long as MDG is careful to make sure they aren’t putting developers at risk of crossing legal terms (e.g. like in Europe… which may be easier said than done with all the changes and different countries/territories to consider) and they take the time to understand what is okay to collect and not to collect (from a general developer ethics standpoint) then I think it’s great that they can take a more data-driven approach… the upside is much greater than the downside imo.

Also, just because Django and Rails don’t have a data-driven approach like this doesn’t mean they shouldn’t… this is no different (from a business perspective) than you using intercom/segment to make your app better… if anything this should drastically increase Meteor’s quality over frameworks that don’t use this strategy (Django, Rails, etc.).

Saying the PRs are enough is like saying you are going to use a forum to get all the feedback in your app instead of segment/intercom/etc… which would be crazy because people won’t always post and you can’t gauge if it’s just that one person having an issue or 10000s of users…

MDG is re-writing the development landscape in the same way Apple re-wrote Microsoft’s “open-source” hardware approaches… this came with pros and cons but it’s generally agreed that Apple has been able to give a better UX (to the general population) because they had more control over the entire product development process… this is just another extension of MDG taking startup/for-profit concepts and applying them to software development.

** if MDG could provide some concrete examples of how this will benefit meteor developers, it would go a long way to getting buy in

  • The name census is excellent. It has great civic undertones, and frames/brands the package as part of an opt-in polling process, and not a backdoor or tracker or big-brother system.

  • From the name alone, one would expect it to be an opt-in process.

  • In the life sciences, people are quite accustomed to things being monitored and regulated. The presence of tracking software is generally expected, and not a deal-breaker. It’s simply a question of who is authorized to track and monitor data.

  • Most healthcare environments are interested in both epidemiology and tech support. A census package could be part of a narrative around a particular type of distribution, similar to how RedHat Linux and CentOS have server farm management software. ie. Galaxy Meteor ships with census as an opt-out, but Community Meteor ships with census as opt-in.

  • Folks in the life science industries are going to want the package credentialed and verified. It will be a no-go for many folks until it gets certified and code-signed. Then it’s going to have a green light, and people will happily install it anywhere.

  • Start with the EV “green bar” SSL certificate. Any tracking communication should be sent to a registered “green bar” address. Analysts will be monitoring the inbound TCP/IP traffic at the enterprise level, and these streams will show up on their logging tools. Give them an authenticated trail to follow back to headquarters and the MDG Privacy Policy.
    https://www.digicert.com/code-signing/ev-code-signing.htm

  • Register with Computer Associates and eTrust and Apple, and whoever else is acting as the big LDAP servers nowadays.

  • In a best-case world, there would be code-signing involved. But I’m not entirely sure how that would work in the Meteor environment.

  • Lacking code signing, most healthcare/life-science enterprises are going to do a code review to make sure that the package is tracking what it claims. ie. ipAddress and meteorVersion are okay, but socialSecurityNumber is not. When that’s complete, they’re going to specify the particular version, and say “This version is approved. Don’t use any other version except this one.”

  • That makes it a bit difficult to do Agile development with a package like this. Expect to use a bit more of a Waterfall design, and do more engineering up-front.

  • The Privacy Policy will need to be updated accordingly, obviously.

OK, let’s use a developer-oriented business then. Docker has raised 6X the money that MDG has, they are open source (~3000 fewer stars than Meteor on GitHub), and I’m going to guess that they are probably more widely used than Meteor, yet they don’t have anything that is similar to this. I feel like I’m failing to see what everyone else does as to how this can possibly be a good thing.

How do you unintentionally include code that returns an object?

  compose() {
    return {
      properties: {
        appId: Config.appId,
        appSecret: Config.appSecret,
        rootUrl: Config.rootUrl,
        version: Meteor.release,
        maxSessions: Stats.maxSessions
      },
      context: {
        app:{
          name: Census.name,
          version: Census.version
        },
        ip: Utils.ip(),
        os: {
          name: Os.platform(),
          version: Os.release()
        }
      }
    };
  }

Especially when the code to get the IP address was also written for this package

// Gets the ip address
  ip() {
    return _.chain(Os.networkInterfaces())
      .flatten()
      .filter(iface => iface.family == 'IPv4' && !iface.internal)
      .pluck('address')
      .first()
      .value();
  },

When the person who was thinking about the requirements is separate from the person writing the code, and something was lost in the communication.

And this is the magic, the wonder of open source - that you can’t accidentally sneak things in, people can take a look at this stuff before it goes in. So I’d say this is a victory of the process.

I know for the company I work for we are very secretive about our stack. And while I have been in contact with certain people in the MDG about our stack, we would be 100% unable to get permission to opt-in if the database was public as there is just too much risk there of competitors figuring it out.

If however the database was closed and either aggregate results or a well sanitized database was made public, we could consider being involved in that.

That was my own mistake, I meant to remove it a few days ago and didn’t do that.
Will remove it today.
There was never a plan to track the inner ip, just the outside ip from the activity server, it’s my own misunderstanding…
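
To make the inner vs. outside distinction concrete: the ip() helper quoted earlier reads the host's own network interfaces, so it yields the machine's address on its own network (often a private LAN address), not the address a receiving server would see on the incoming connection. Below is a dependency-free sketch of the same lookup. `firstExternalIPv4` is a hypothetical name, and the sample object only mimics the shape of Node's os.networkInterfaces() result.

```javascript
// Same lookup as the quoted helper, without underscore.
// `interfaces` has the shape returned by Node's os.networkInterfaces().
function firstExternalIPv4(interfaces) {
  for (const addresses of Object.values(interfaces)) {
    for (const iface of addresses || []) {
      // A few Node releases report the family as the number 4 instead of 'IPv4'.
      const isIPv4 = iface.family === 'IPv4' || iface.family === 4;
      if (isIPv4 && !iface.internal) return iface.address;
    }
  }
  return undefined; // no non-loopback IPv4 interface found
}

// Made-up sample data; in Node you would pass os.networkInterfaces() instead.
const sample = {
  lo: [{ address: '127.0.0.1', family: 'IPv4', internal: true }],
  eth0: [
    { address: 'fe80::1', family: 'IPv6', internal: false },
    { address: '192.168.1.23', family: 'IPv4', internal: false }
  ]
};

console.log(firstExternalIPv4(sample)); // 192.168.1.23
```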

I feel like I’m coming off more harshly than I mean to. More than anything I’m just voicing my personal concerns. With all of the TOS that I’ve agreed to over the years and the amount of personal data available about me that’s already on the internet it’s at least slightly disconcerting that these TOS/opt-out data tracking servers are following me to websites/mobile apps that I have created or helped create especially when there is no clear definition as to what the information is going to be used for.
