00:00:02.310 --> 00:00:20.310
Mick Coady: Just to remind everybody that these meetings will be recorded and the recordings will be made available on the GTT, and I'll let you know when those are available.

Mick Coady: Hopefully, by tomorrow or early next week so.

4
00:00:26.670 --> 00:00:31.980
Mick Coady: Thanks again, everybody, for joining us. Looks like we've got about 15 folks here with us.

7
00:00:38.460 --> 00:00:46.170
Mick Coady: Full agenda. First of all, I want to talk just briefly about the gpudev queue, and then, as I mentioned in the reminder email yesterday, Evan will be presenting. What we think will probably be the bulk of the meeting will be his presentation on profiling MURaM with Nsight, and then just a quick, or maybe not quick, round table at the end.

10
00:01:08.850 --> 00:01:09.480
Mick Coady: So.

11
00:01:10.500 --> 00:01:20.850
Mick Coady: I've been kind of keeping track of the gpudev queue that we introduced here a while back, and.

12
00:01:21.930 --> 00:01:26.880
Mick Coady: The what I wanted to bring up or talk about is that since the.

13
00:01:27.900 --> 00:01:28.770
Mick Coady: queue was.

14
00:01:30.120 --> 00:01:34.710
Mick Coady: released and deployed we've had a total of zero jobs.

15
00:01:36.060 --> 00:01:36.810
Mick Coady: submitted to.

16
00:01:38.520 --> 00:01:39.630
Mick Coady: And so.

17
00:01:40.770 --> 00:02:00.690
Mick Coady: As opposed to the gpgpu queue, where in that same time period over 4,800 jobs have been submitted and run. So the question for the group is really: should we continue providing the gpudev queue.

18
00:02:01.770 --> 00:02:02.580
Mick Coady: Given its.

19
00:02:03.990 --> 00:02:05.070
Mick Coady: Lack of use.

20
00:02:05.400 --> 00:02:06.660
If it's still.

21
00:02:09.210 --> 00:02:23.010
Mick Coady: might be of value to the user community, then, you know, I'm not suggesting or promoting that we take it away, but at the same time it hasn't received a whole lot of love in that timeframe, so.

22
00:02:25.200 --> 00:02:30.660
Mick Coady: I'll go ahead and open it up. John, I think you had your hand up first.

23
00:02:31.500 --> 00:02:37.500
John Dennis (he/him): So I think there's something wrong with your scripts, because I have personally submitted about 40 or 50 jobs to it.

24
00:02:39.270 --> 00:02:49.860
Mick Coady: Okay, let's see, is john blas here with us today? Thanks for that; I'm not shocked to get that feedback, John.

25
00:02:51.330 --> 00:02:55.560
Mick Coady: So we then we'll take that under advisement okay.

26
00:02:56.070 --> 00:02:56.520
John Dennis (he/him): yeah.

27
00:02:57.120 --> 00:03:00.510
Mick Coady: Did you get errors when you tried?

28
00:03:00.960 --> 00:03:07.890
John Dennis (he/him): No, I've been, you know, ever since it was released I've been running in it almost exclusively, so.

29
00:03:09.000 --> 00:03:09.600
John Dennis (he/him): i've had.

30
00:03:09.690 --> 00:03:11.670
John Dennis (he/him): Each of those jobs has worked.

31
00:03:13.440 --> 00:03:20.370
John Dennis (he/him): A lot of them got killed because they, you know, exceeded the 30-minute time limit, and I found it highly useful.

32
00:03:21.510 --> 00:03:28.200
Mick Coady: Well, thanks for the feedback. As you can see from my slide, I just used our qhist.

33
00:03:30.690 --> 00:03:39.720
Mick Coady: script that we use to monitor that, so I might ask Brian if he has.

34
00:03:40.890 --> 00:03:43.680
Mick Coady: Any ideas of what might be going on there.

35
00:03:44.160 --> 00:03:45.210
Brian Vanderwende: yeah I think so.

36
00:03:46.230 --> 00:04:00.420
Brian Vanderwende: So if I recall correctly, the gpudev queue is a routing queue that routes to a reservation, the standing reservation, and qhist queries the record of the job, so you would want to search for the reservation name.
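
The effect Brian describes can be sketched in a few lines of Python. This is only an illustration: the record fields and the reservation name `R1234567` are invented for the example and are not real qhist or scheduler output. It shows why a search on the routing queue's name can return zero even when jobs were submitted to it.

```python
# Hypothetical job records: the field names and the reservation name
# "R1234567" are invented for illustration, not real scheduler output.
jobs = [
    {"id": 1, "submitted_to": "gpudev", "recorded_queue": "R1234567"},
    {"id": 2, "submitted_to": "gpudev", "recorded_queue": "R1234567"},
    {"id": 3, "submitted_to": "gpgpu", "recorded_queue": "gpgpu"},
]


def count_jobs(records, queue_name):
    """Count records whose recorded (final) queue matches queue_name."""
    return sum(1 for r in records if r["recorded_queue"] == queue_name)


# Searching on the routing queue name finds nothing, because the job
# record carries the reservation the job was routed to.
by_routing_name = count_jobs(jobs, "gpudev")
by_reservation = count_jobs(jobs, "R1234567")
print(by_routing_name, by_reservation)
```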

37
00:04:01.140 --> 00:04:02.400
Mick Coady: Okay okay.

38
00:04:03.420 --> 00:04:24.630
Mick Coady: Then that's obviously my bad on that. I'll do that and then update everybody via email on its usage. I was a little surprised and disappointed when I saw those numbers, so.

39
00:04:26.490 --> 00:04:45.660
Irfan Elahi: Mick, thank you so much, John Dennis, for pointing that out. And just for everyone else, if you encounter or experience something like this, please don't hesitate in real time; just send an email to csc@uconn.edu so that maybe we'll get a little bit more prompt.

40
00:04:47.700 --> 00:04:53.070
Irfan Elahi: feedback, you know, and be able to correct it a little bit earlier, yeah.

41
00:04:54.180 --> 00:05:00.090
Mick Coady: I think the error's on my part, Irfan, in that I misinterpreted the.

42
00:05:02.100 --> 00:05:03.900
Mick Coady: Like my query so.

43
00:05:05.100 --> 00:05:11.190
Mick Coady: John, if I can just kind of repeat what you just said, it sounds like.

44
00:05:12.570 --> 00:05:16.170
Mick Coady: You have been using it and it's been real useful for you then.

45
00:05:17.370 --> 00:05:17.880
John Dennis (he/him): Yes.

46
00:05:17.940 --> 00:05:20.910
Mick Coady: Correct okay okay good no problems, then right.

47
00:05:21.780 --> 00:05:24.030
Mick Coady: yep okay great great thanks.

48
00:05:25.350 --> 00:05:28.470
Mick Coady: Siena, you had your hand up.

49
00:05:30.420 --> 00:05:34.290
cmille73: Sorry, I was about to say pretty much what john was saying.

50
00:05:34.530 --> 00:05:39.180
cmille73: It's just something that I started using last week and have been using heavily this week, so thank you.

51
00:05:39.300 --> 00:05:43.890
Mick Coady: it's been really awesome well what What surprised me is.

52
00:05:45.150 --> 00:05:53.010
Mick Coady: So your comments don't come as a shock to me, because I watch the V100 usage.

53
00:05:54.060 --> 00:05:57.840
Mick Coady: pretty regularly through the day, and.

54
00:05:58.890 --> 00:06:08.010
Mick Coady: if you've been following along on the CISL resource status page, you'll see that it's rare that the V100s have been.

55
00:06:09.600 --> 00:06:10.590
Mick Coady: heavily used.

56
00:06:11.760 --> 00:06:16.950
Mick Coady: throughout the day, so I was a little surprised to see that.

57
00:06:18.270 --> 00:06:25.260
Mick Coady: So this all clears things up, I think, quite a bit. I'm glad we had this conversation.

58
00:06:26.340 --> 00:06:27.480
Mick Coady: So thanks, Siena.

59
00:06:33.180 --> 00:06:34.680
Mick Coady: Yeah, Brian, you've got your hand up.

60
00:06:34.980 --> 00:06:42.510
Brian Vanderwende: Oh yeah I just thought i'd give a little more context on the number, I gave there, so it does look like pretty much the.

61
00:06:43.830 --> 00:06:50.400
Brian Vanderwende: Three users who have used the queue since we launched it and it's mostly john Dennis.

62
00:06:51.780 --> 00:07:08.070
Brian Vanderwende: Siena and, it looks like, Supreeth and Jeremy have submitted a couple of jobs as well. My point isn't to call particular people out, but it is a pretty small group of folks who are leveraging it, so it might be a good idea for us to try to.

63
00:07:09.450 --> 00:07:14.610
Brian Vanderwende: You know advertise it more broadly again just to see if we can catch any more people.

64
00:07:15.000 --> 00:07:16.920
Brian Vanderwende: And I guess I would be curious from this.

65
00:07:16.980 --> 00:07:21.060
Brian Vanderwende: audience, especially with people who have been using it if they feel like the.

66
00:07:22.440 --> 00:07:29.520
Brian Vanderwende: The limits and the capabilities of the queue feel like they're on target for what you're doing or, if you have any suggested adjustments.

67
00:07:32.130 --> 00:07:33.480
Mick Coady: Go ahead, Siena.

68
00:07:34.770 --> 00:07:35.100
cmille73: yeah.

69
00:07:36.330 --> 00:07:38.490
cmille73: I did remember one of the things that.

70
00:07:39.720 --> 00:07:44.280
cmille73: was a small hurdle to start using it, I remembered that it existed.

71
00:07:45.330 --> 00:07:50.250
cmille73: And I couldn't quite find the documentation for it on the CISL website.

72
00:07:50.970 --> 00:07:52.290
cmille73: I ended up just doing a.

73
00:07:52.890 --> 00:08:01.650
cmille73: qstat to see all the names of the queues. Maybe we could include the documentation for it on the website, or maybe I was looking in the wrong place.

74
00:08:02.940 --> 00:08:13.650
Mick Coady: Well, if you had trouble finding it, that's reason enough for us to look at how it's organized and what's out there, so.

75
00:08:15.090 --> 00:08:16.710
Mick Coady: We will get that fixed up.

76
00:08:18.630 --> 00:08:22.830
cmille73: But otherwise I was okay with the time limit. Looks like Supreeth has a comment on that.

77
00:08:26.460 --> 00:08:35.220
Supreeth Madapur Suresh: Yeah, the time limit is a little short for me, so that's why I've been using the normal queue, and I've been getting the GPUs on a pretty regular basis. I don't have any complaints.

78
00:08:36.240 --> 00:08:36.570
Mick Coady: Okay.

79
00:08:38.010 --> 00:08:40.350
Mick Coady: Thanks. Davide, you had your hand up.

80
00:08:41.970 --> 00:08:50.340
Davide Del Vento: Yeah, I just wanted to share my experience, which has been, you know, I just use execcasper and whatever default is there.

81
00:08:51.960 --> 00:08:58.170
Davide Del Vento: for most of my work, and I don't know if that uses gpudev or not, but that works very well for me.

82
00:08:58.980 --> 00:09:00.570
Mick Coady: Okay, thanks.

83
00:09:00.840 --> 00:09:08.370
Davide Del Vento: I mean, from a user perspective I didn't even pay attention to what queue I'm using, and it just works fine.

84
00:09:08.640 --> 00:09:15.660
Mick Coady: Well, you would have had to submit directly to the gpudev queue, I think, in order to take advantage of it, so.

85
00:09:16.470 --> 00:09:29.490
Davide Del Vento: Okay, well, I'm specifying that I want a V100 or, maybe, whatever I want according to what I'm doing, but I don't care about the queue, I mean, just because it goes. I mean, if it were sitting in the queue forever, then I would care.

86
00:09:31.200 --> 00:09:32.700
Mick Coady: One of us would have heard about it right.

87
00:09:34.140 --> 00:09:34.470
Davide Del Vento: yeah.

88
00:09:34.710 --> 00:09:39.990
Mick Coady: Okay, thanks. Let's see, DJ, you had your hand up.

89
00:09:43.050 --> 00:09:51.030
David John Gagne: Yeah, just a couple of comments and questions. One, I think within my group there probably hasn't been a lot of awareness of it.

90
00:09:52.440 --> 00:10:03.060
David John Gagne: That's something I could promote to people. I think, from my understanding of how most in the group are running stuff, it tends to be a lot of single-GPU jobs.

91
00:10:05.190 --> 00:10:13.140
David John Gagne: Especially our hyperparameter optimization stuff: we tend to launch a bunch of single-GPU jobs and they kind of fill in.

92
00:10:15.600 --> 00:10:27.150
David John Gagne: wherever the need is, and since a big multi-GPU job usually sits in the queue a lot longer, that more distributed strategy has at least gotten things through faster.

93
00:10:29.070 --> 00:10:29.250
But.

94
00:10:30.510 --> 00:10:47.040
David John Gagne: One potential question, or I guess option, with this: if someone doesn't request the gpudev queue directly but, say, submits a job that's less than 30 minutes, is there a way to have it route to gpudev automatically?

95
00:10:49.050 --> 00:10:51.300
David John Gagne: If they request gpgpu but then.

96
00:10:52.740 --> 00:11:01.680
David John Gagne: say less than 30 minutes, it could go to gpudev. The second one is: how many GPUs can you request for a gpudev job?

97
00:11:02.970 --> 00:11:07.350
David John Gagne: Is it only limited to four, or how does that work?

98
00:11:10.740 --> 00:11:11.010
David John Gagne: yeah.

99
00:11:11.400 --> 00:11:20.310
Mick Coady: Good questions. The limit is four for that, because we're currently reserving one of the four-ways.

100
00:11:21.480 --> 00:11:26.370
Mick Coady: on Casper for that through, you know, working hours.

101
00:11:27.510 --> 00:11:33.480
Mick Coady: your other question actually was rattling around in the back of my mind too is if someone.

102
00:11:34.380 --> 00:11:50.490
Mick Coady: submits a job to the gpgpu queue for less than 30 minutes, would it get automatically routed to the dev queue? And I'm going to ask my colleagues to help me out on that one, because I'm not sure, yeah.

103
00:11:50.880 --> 00:12:01.980
Brian Vanderwende: Mick, so the answer is no, and that's intentional, purely because the gpudev queue was designed to be a work-hours queue.

104
00:12:02.910 --> 00:12:04.350
Brian Vanderwende: We didn't want a situation where.

105
00:12:04.410 --> 00:12:10.800
Brian Vanderwende: somebody submits a job at 4:50, it gets routed into gpudev, and then it's stuck there until like 8 a.m. the next day.

106
00:12:10.980 --> 00:12:13.650
Mick Coady: No, that's right, yeah. Okay, thanks, Brian.

107
00:12:14.760 --> 00:12:26.670
Brian Vanderwende: But I mean, there could be other ways to design gpudev so it's more seamless. It's just that that was the way to have exclusive resources pointed to it versus, say, giving it a lot of priority and just hoping that it works out.

108
00:12:27.120 --> 00:12:28.020
Mick Coady: yeah yeah.

109
00:12:28.590 --> 00:12:32.070
Mick Coady: Yeah, so DJ, I do remember now, we had.

110
00:12:33.960 --> 00:12:43.560
Mick Coady: quite a lengthy, you know, debate, if you would, within htc about how to structure that queue, and we came down on what.

111
00:12:44.610 --> 00:12:48.480
Mick Coady: decided to try, starting with what Brian just described.

112
00:12:50.130 --> 00:12:51.840
David John Gagne: Okay, that that sounds reasonable I.

113
00:12:52.860 --> 00:12:55.140
David John Gagne: think it does address the issue like.

114
00:12:56.520 --> 00:13:05.160
David John Gagne: In the afternoon people want to do something with the gpu but it's not the things take a while to get spun up at that point so.

115
00:13:05.340 --> 00:13:12.120
David John Gagne: I'll try to advertise it to my group a bit more and see if I can get some people to try it out.

116
00:13:13.020 --> 00:13:17.460
Mick Coady: Yeah, and to Brian's point, we'll have another round of.

117
00:13:18.750 --> 00:13:32.670
Mick Coady: communications to the broader user community on this too, right, and to Siena's point, it sounds like we need to make what documentation we do have on it easier to find.

118
00:13:34.440 --> 00:13:51.270
Mick Coady: Okay, so DJ and everyone else, if you find that you're having any issues, please don't hesitate to reach out to me or to CSG, like Irfan said, and we'll try and address that as quickly and properly as we can.

119
00:13:52.710 --> 00:13:53.580
Okay, thanks.

120
00:13:55.020 --> 00:13:56.430
Mick Coady: Davide, your hand up again?

121
00:13:59.640 --> 00:14:02.610
Davide Del Vento: No, sorry I forgot to raise it okay okay.

122
00:14:03.420 --> 00:14:06.060
Mick Coady: John Dennis, you had your hand up, I see.

123
00:14:06.060 --> 00:14:15.210
John Dennis (he/him): Well, this is not to pile on, but I figured out how to use it by reading john glasses presentation.

124
00:14:16.470 --> 00:14:19.290
Mick Coady: Okay, so piling on yeah.

125
00:14:19.470 --> 00:14:21.480
John Dennis (he/him): yeah it's anyway.

126
00:14:21.990 --> 00:14:23.070
Mick Coady: yeah no thanks so.

127
00:14:25.050 --> 00:14:25.500
Mick Coady: here's.

128
00:14:27.360 --> 00:14:31.140
Mick Coady: It's clearly a hole we need to fill.

129
00:14:35.190 --> 00:14:45.240
Mick Coady: Okay, good. I'll be honest, this was a much better conversation and discussion than I was expecting or would have anticipated, so thanks for that.

130
00:14:46.830 --> 00:15:06.480
Mick Coady: Okay, well, I think we can eliminate my question here about whether we should continue providing the queue. It sounds like we should, but we just need to advertise it better and get that word out. So I'm going to turn it over to Evan now. Evan is part of the.

131
00:15:07.530 --> 00:15:21.600
Mick Coady: team and a member of TDD, and works closely with Supreeth and Siena and has been active in the MURaM port. So Evan, with that, I'm going to stop sharing and turn it over to you.

132
00:15:23.730 --> 00:15:24.210
Evan MacBride: alright.

133
00:15:25.950 --> 00:15:26.280
Evan MacBride: Great.

134
00:15:27.390 --> 00:15:30.060
Evan MacBride: Let me just share my screen here.

135
00:15:41.550 --> 00:15:42.570
Evan MacBride: Can everyone see that.

136
00:15:47.430 --> 00:15:47.790
Brian Dobbins: yep.

137
00:15:48.330 --> 00:15:48.660
Evan MacBride: All right.

138
00:15:52.140 --> 00:16:03.330
Evan MacBride: Okay, so my name is Evan MacBride. I'm a student assistant with stp ASAP, and today I'll be presenting "Profiling MURaM with Nsight: a.

139
00:16:03.930 --> 00:16:15.930
Evan MacBride: guide to Nsight Systems and Nsight Compute." This is based on a written profiling guide that we've tested out with some students from the University of Delaware.

140
00:16:16.680 --> 00:16:28.440
Evan MacBride: And they were able to use that to profile with Nsight successfully, so hopefully that guide and this presentation will be useful to others as well.

141
00:16:29.940 --> 00:16:49.740
Evan MacBride: So first, some background on the Nsight profilers. Nsight is split into several programs, and today we'll look at Nsight Systems and Nsight Compute. Nsight Systems is a system-wide performance analysis tool that gives high-level analysis of the entire program.

142
00:16:51.060 --> 00:17:06.240
Evan MacBride: Notably, it provides timeline views that let you see what's happening when across the program. Nsight Compute, on the other hand, is a GPU kernel profiler that gives detailed analysis of individual kernels.

143
00:17:07.560 --> 00:17:22.140
Evan MacBride: Other differences: Nsight Systems creates .qdrep files and Nsight Compute creates .ncu-rep files. Both programs have their own command-line interfaces and separate GUIs.

144
00:17:24.060 --> 00:17:35.670
Evan MacBride: So it is possible to query individual metrics with the Nsight command-line interfaces. However, in general, it's easier to generate a full profile on Casper.

145
00:17:36.270 --> 00:17:49.080
Evan MacBride: and then view that profile in the GUI on your local system. To do this, first you will need to download and install the Nsight GUIs on your local system.

146
00:17:49.770 --> 00:18:11.490
Evan MacBride: Then we'll do a multi-phase profiling process where we record profiling data on Casper, then download that profile to a local system with a tool such as scp, then use the Nsight GUI on the local system to view a neatly organized profile.

147
00:18:13.350 --> 00:18:20.040
Evan MacBride: So today we'll be talking about profiling MURaM, which is a solar physics modeling code.

148
00:18:21.150 --> 00:18:29.760
Evan MacBride: And here's some version and system information related to how the example profiles were generated for this presentation.

149
00:18:31.410 --> 00:18:41.820
Evan MacBride: So when we're profiling a code for the first time, it's probably a good idea to start with Nsight Systems, since that will give us a high-level overview.

150
00:18:42.360 --> 00:19:03.660
Evan MacBride: That will help us figure out where to go next. Nothing special needs to be done while building MURaM to profile with Nsight. Include your usual optimization flags. Remember, though, not to include any debugging flags, because that will mess up some things with how Nsight needs to work.

151
00:19:05.850 --> 00:19:18.780
Evan MacBride: So here is an example nsys command. To break down the syntax here, we're calling MURaM via nsys in profile mode.

152
00:19:19.530 --> 00:19:45.540
Evan MacBride: We're saying we want to output to a .qdrep file for the MURaM nsys profile, and we're requesting the OpenACC, CUDA, and MPI traces. The default values for trace are cuda, opengl, nvtx, and osrt (for OS runtime). Other traces are available as well.
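
As a rough sketch of the kind of command described here, the Python below only assembles and prints an `nsys profile` command line; it does not run anything. The output stem `muram_nsys_prof`, the executable name `./muram_exe`, and the placement of `mpirun` after the profiler are assumptions made for illustration, not taken from the slide.

```python
import shlex


def build_nsys_profile_cmd(app_cmd, output, traces):
    """Assemble an `nsys profile` command line as an argument list."""
    return ["nsys", "profile", "-o", output, "-t", ",".join(traces), *app_cmd]


# Placeholder names: the output stem and the launch line are hypothetical.
cmd = build_nsys_profile_cmd(
    app_cmd=["mpirun", "-n", "2", "./muram_exe"],
    output="muram_nsys_prof",
    traces=["openacc", "cuda", "mpi"],
)
print(shlex.join(cmd))
```

Building the command as a list keeps each flag and its value explicit, which makes it easy to swap the trace list or the launch line for a given site.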

153
00:19:47.910 --> 00:19:49.110
Evan MacBride: Alright, so.

154
00:19:51.600 --> 00:19:54.060
Evan MacBride: Just can mean once again here.

155
00:19:55.080 --> 00:20:10.920
Evan MacBride: Alright, so I submitted a profiling job to Casper, I generated a .qdrep file, and then I downloaded that profile to my laptop. On this slide we are seeing a screenshot from my Nsight Systems GUI.

156
00:20:12.090 --> 00:20:19.500
Evan MacBride: So in the timeline view here we can collapse or reveal different levels of hierarchy.

157
00:20:20.400 --> 00:20:35.640
Evan MacBride: from processes down to kernels and memory accesses, and that's controlled by this left sidebar here. So here I have my timeline set up to show the two MPI ranks that I ran.

158
00:20:36.270 --> 00:20:54.900
Evan MacBride: I can see those so we can zoom in horizontally either using trackpad or a mouse following the instructions here and we can also expand elements vertically, to make them easier to read by clicking on the magnifying glass icon on the top right.

159
00:20:59.340 --> 00:21:14.580
Evan MacBride: And this is a zoomed-in portion of the timeline view within a time step, showing HtoD and DtoH memory transfers highlighted here in green.

160
00:21:15.720 --> 00:21:24.150
Evan MacBride: And we're also seeing some OpenACC activity down here on this line, so we're seeing.

161
00:21:27.240 --> 00:21:47.490
Evan MacBride: memory transfer activity within a time step, and we're also seeing some gaps in our OpenACC computation that are tied to this MPI_Allgather. So from this, we can learn that we.

162
00:21:49.020 --> 00:21:58.200
Evan MacBride: are having some possibly unnecessary memory transfers ideally we'd like to have as few of these within a time step and.

163
00:21:59.250 --> 00:22:15.150
Evan MacBride: throughout the course of our program as possible, and we also still apparently need to transfer some computation from the cpu to the gpu we'd like to have as much done as possible on the gpu and we'd like to.

164
00:22:16.380 --> 00:22:20.010
Evan MacBride: not be transferring data back and forth as much as possible.

165
00:22:22.140 --> 00:22:29.460
Evan MacBride: So here we have an even more zoomed-in portion of the timeline view, this time showing some DtoD.

166
00:22:30.570 --> 00:22:36.660
Evan MacBride: memory transfers in stream 25 here, I believe.

167
00:22:39.210 --> 00:22:41.640
Evan MacBride: Right this stream here.

168
00:22:42.810 --> 00:22:45.240
Evan MacBride: So, since we're seeing DtoD.

169
00:22:46.380 --> 00:22:50.220
Evan MacBride: that is telling us that GPUDirect communication is active.

170
00:22:51.420 --> 00:22:58.260
Evan MacBride: If this were a multi node run the profile would show p2p or peer to peer transfers.

171
00:23:00.000 --> 00:23:08.910
Evan MacBride: We can see in more detail in this view a gap in computation again in the OpenACC line.

172
00:23:12.420 --> 00:23:16.710
Evan MacBride: here, corresponding to this MPI wait.

173
00:23:18.990 --> 00:23:27.510
Evan MacBride: yeah and again we want to avoid these gaps wherever we can so if we want to know more about.

174
00:23:30.360 --> 00:23:49.950
Evan MacBride: a dark blue kernel region on the screen, we can hover over it to show a tooltip that will give us some limited information. It's not going to work now, because this is just a screenshot, but when you're in the GUI that is an option, and when we have a.

175
00:23:51.540 --> 00:23:55.260
Evan MacBride: stream expanded, we can also right-click on.

176
00:23:56.670 --> 00:23:59.310
Evan MacBride: Different kernels and we can get a.

177
00:24:00.540 --> 00:24:15.810
Evan MacBride: pop-up that will tell us a basic command for Nsight Compute that we can enter to get further information. Thank you to Daniel for giving us that tip, something I wasn't aware of until today.

178
00:24:17.100 --> 00:24:18.720
Evan MacBride: So that will give us a.

179
00:24:20.070 --> 00:24:27.750
Evan MacBride: bit of a transition to talking about Nsight Compute and getting more detailed information, but first I will mention.

180
00:24:28.410 --> 00:24:39.630
Evan MacBride: the different reports that Nsight Systems can generate. In addition to the GUI, we can use the Nsight CLI to generate small, concise reports.

181
00:24:40.080 --> 00:25:05.220
Evan MacBride: that are generated from our .qdrep profiles. These are useful if we don't want to comb through the timeline or the listing of different events and just want a nice summary of what is going on in our program. So here we have the apigpusum report in my command here.

182
00:25:06.450 --> 00:25:12.570
Evan MacBride: Just breaking down the syntax a little bit, we enter nsys stats for.

183
00:25:13.980 --> 00:25:23.160
Evan MacBride: statistics mode, I guess. To get one of these reports, we tell nsys which report we'd like, and then we pass in a CSV.

184
00:25:24.180 --> 00:25:30.720
Evan MacBride: format flag. Other formats are tsv and json, and there are a few others as well.

185
00:25:34.440 --> 00:25:51.630
Evan MacBride: Right, so this is showing us a pretty high-level overview of the percent time that different operations and kernels are taking, as well as absolute time in nanoseconds, number of instances, and other information like that.
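
The report command described above might be assembled along these lines. This is a sketch under assumptions: the report name `apigpusum` is written as it was heard in the talk (other Nsight Systems releases may name reports differently), and the input file name is a placeholder. The code only builds and prints the command.

```python
import shlex


def build_nsys_stats_cmd(profile_file, report, fmt="csv"):
    """Assemble an `nsys stats` command line as an argument list."""
    return ["nsys", "stats", "--report", report, "--format", fmt, profile_file]


# Placeholder file name; the report name follows the talk.
stats_cmd = build_nsys_stats_cmd("muram_nsys_prof.qdrep", report="apigpusum")
print(shlex.join(stats_cmd))
```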

186
00:25:52.710 --> 00:26:13.560
Evan MacBride: So if we are curious about what to do next in our profiling, one method we could use is to iterate from top to bottom over these kernels and examine them further in Nsight Compute to get some more detailed information, and that is what we'll do next.

187
00:26:15.960 --> 00:26:16.950
Mick Coady: Is a question for you.

188
00:26:16.980 --> 00:26:17.730
Evan MacBride: Okay, great.

189
00:26:21.030 --> 00:26:23.010
John Dennis (he/him): I was gonna wait until the end asked it.

190
00:26:23.160 --> 00:26:30.270
John Dennis (he/him): Sorry i'm sorry so but i'll ask it now sorry, so can we go back a slide.

191
00:26:30.480 --> 00:26:30.870
sure.

192
00:26:34.080 --> 00:26:46.230
John Dennis (he/him): I know in a group meeting we looked at this list of operations and we got really confused about why kernel launches and stream synchronize were so high, and.

193
00:26:48.060 --> 00:26:53.970
John Dennis (he/him): do we have any best practices on how not to be fooled by that, or are we still sorting it out?

194
00:26:54.330 --> 00:27:10.050
Evan MacBride: I would still like more guidance about making sure that those, yeah, cudaLaunchKernel and cudaStreamSynchronize, aren't being duplicated, and that is something I'm still not very clear on.

195
00:27:11.400 --> 00:27:15.090
Evan MacBride: I think it was ragu who said that that that might be.

196
00:27:16.410 --> 00:27:18.270
Evan MacBride: Being summed across.

197
00:27:20.040 --> 00:27:22.560
Evan MacBride: Different streams or something like that, but.

198
00:27:23.730 --> 00:27:27.360
Evan MacBride: yeah that that is something that I am unclear.

199
00:27:29.010 --> 00:27:32.070
Evan MacBride: And I don't know if Supreeth or Siena.

200
00:27:33.270 --> 00:27:35.670
Evan MacBride: has a more definite answer for that.

201
00:27:43.020 --> 00:27:43.860
Supreeth Madapur Suresh: yeah I mean.

202
00:27:45.150 --> 00:27:56.160
Supreeth Madapur Suresh: Yes, I do have a few more things to add here. So, the kernel launches and stream synchronize: if you go back and look at the GUI of the profile that we've got.

203
00:27:56.820 --> 00:28:01.290
Supreeth Madapur Suresh: especially the cudaStreamSynchronize, it spans.

204
00:28:02.160 --> 00:28:12.870
Supreeth Madapur Suresh: from the beginning to the end of the kernel, so that actually includes the execution of the kernel. So when we get a raw CSV view like Evan was showing in the next slide.

205
00:28:13.740 --> 00:28:25.950
Supreeth Madapur Suresh: We do want to take about four times a week but there's a way to minimize it, but we do want to take that out of the report and then look at just the kernels, and then.

206
00:28:26.820 --> 00:28:41.880
Supreeth Madapur Suresh: by using asynchronous or multiple streams, you can reduce the time for stream synchronize, but it does show up, and it includes the kernel execution time as well. We could discuss more on this offline, okay.
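
One way to act on Supreeth's suggestion is to drop the synchronize row from the summary and recompute the shares. The sketch below does that on a made-up CSV: the column names, kernel names, and times are invented for illustration and do not come from a real report.

```python
import csv
import io

# Made-up summary data; real report columns and values will differ.
sample = """\
Name,Total Time (ns)
cudaStreamSynchronize,9000
cudaLaunchKernel,1000
kernel_a,6000
kernel_b,3000
"""


def percent_time(rows, exclude=()):
    """Return each row's share of total time, ignoring excluded names."""
    kept = [(r["Name"], int(r["Total Time (ns)"]))
            for r in rows if r["Name"] not in exclude]
    total = sum(t for _, t in kept)
    return {name: 100.0 * t / total for name, t in kept}


rows = list(csv.DictReader(io.StringIO(sample)))
# The synchronize time overlaps kernel execution, so leaving it in
# would count that time twice when ranking operations.
shares = percent_time(rows, exclude={"cudaStreamSynchronize"})
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```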

207
00:28:42.600 --> 00:28:47.610
Evan MacBride: All right, great that clears that up a little bit better, for me, so thank you.

208
00:28:50.130 --> 00:28:52.890
Evan MacBride: Alright So here we will.

209
00:28:54.060 --> 00:28:55.920
Evan MacBride: go into Nsight Compute.

210
00:28:57.240 --> 00:29:09.900
Evan MacBride: So I'm choosing to start my Nsight Compute profiling by looking at the MHD residual kernels in MURaM, which are towards the top of that apigpusum report.

211
00:29:10.950 --> 00:29:19.800
Evan MacBride: These kernels are among the most time-consuming in the program. In my command here I'm requesting.

212
00:29:21.030 --> 00:29:23.550
Evan MacBride: All processes, including child processes.

213
00:29:25.980 --> 00:29:38.250
Evan MacBride: I'm requesting the full set of metrics with --set full, and this will give us lots of information, which hopefully means we'll only need to profile this version of the code once.

214
00:29:39.450 --> 00:29:43.680
Evan MacBride: I think it's helpful to always include --set full.

215
00:29:45.090 --> 00:29:53.250
Evan MacBride: just so you don't find yourself going back and requesting a different set. I think it's fine to just have everything where you need it, all at once.

216
00:29:55.860 --> 00:30:06.060
Evan MacBride: And I'm also requesting output to an .ncu-rep file for the MHD residual profile, and.

217
00:30:07.620 --> 00:30:18.300
Evan MacBride: passing in --kernel-id. With this here I'm specifying that I want a specific set of kernels, which.

218
00:30:19.650 --> 00:30:30.030
Evan MacBride: include the substring for the MHD residual, and that will give me all of the kernels in the MHD residual function. And finally, this.

219
00:30:31.140 --> 00:30:37.200
Evan MacBride: Five here is saying that I only want the fifth invocation of each of these.

220
00:30:38.340 --> 00:30:49.410
Evan MacBride: One more thing: if you wish to force an overwrite of files with this same profile name, you can include this -f, or --force-overwrite true, flag.
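
A sketch of how the Nsight Compute command walked through above could be put together. The kernel substring `mhd_res`, the output stem, and the executable name are assumptions made for illustration; the exact strings on the slide are not recoverable from the recording. The `--kernel-id` value follows the `context:stream:name:invocation` layout with the `regex:` name operator, and the two leading fields are left empty. The overwrite flag is left out of this sketch.

```python
import shlex


def build_ncu_cmd(app_cmd, output, kernel_regex, invocation):
    """Assemble an `ncu` command that profiles one invocation of matching kernels."""
    kernel_id = f"::regex:{kernel_regex}:{invocation}"
    return [
        "ncu",
        "--target-processes", "all",  # include child processes
        "--set", "full",              # collect the full metric set
        "-o", output,
        "--kernel-id", kernel_id,
        *app_cmd,
    ]


# Hypothetical names throughout; adjust to the real kernel and binary names.
ncu_cmd = build_ncu_cmd(["./muram_exe"], "mhd_res_prof", "mhd_res", 5)
print(shlex.join(ncu_cmd))
```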

221
00:30:51.030 --> 00:30:55.620
Evan MacBride: And alright So here we have.

222
00:30:56.700 --> 00:31:03.480
Evan MacBride: my Nsight Compute profile. This is opened up here in the GUI.

223
00:31:05.010 --> 00:31:26.520
Evan MacBride: The first section that you'll see is the so-called Speed of Light, and that is expressing our kernel's resource utilization as a percentage of the theoretical maximum for our device. That is just the terminology that NVIDIA chose to use.

224
00:31:28.320 --> 00:31:32.280
Evan MacBride: yeah, I think, because the idea being that you can't get faster than the speed of light.

225
00:31:33.840 --> 00:31:51.930
Evan MacBride: And here we see that the SM and memory utilization for this kernel are both pretty low compared to the theoretical max for the device. If we want to.

226
00:31:53.790 --> 00:31:58.770
Evan MacBride: get more information, we can keep scrolling I didn't leave the snowden.

227
00:32:01.080 --> 00:32:04.290
Evan MacBride: separated asked about calculating the cga.

228
00:32:05.550 --> 00:32:16.530
Evan MacBride: which unfortunately isn't super straightforward in Nsight Compute, at least as far as I've seen so far, but it is possible, you just need to.

229
00:32:18.120 --> 00:32:21.780
Evan MacBride: re-profile and request a separate set of metrics.

230
00:32:23.460 --> 00:32:33.840
Evan MacBride: But you can get information about whether we are memory bound or compute bound by looking at the roofline analysis which will be in just a few slides.

231
00:32:36.210 --> 00:32:55.200
Evan MacBride: The next section we'll look at is the occupancy, and achieved occupancy for this kernel is pretty low: it's only 12.32%. The theoretical max for.

232
00:32:55.980 --> 00:33:09.900
Evan MacBride: how our code is set up is only 12.5%, so low occupancy means there are not enough eligible warps to hide the latency between dependent instructions, and it's causing our performance to suffer.

233
00:33:10.800 --> 00:33:30.330
Evan MacBride: We can also see here that we are using 170 registers, and Nsight Compute is showing us the effects on theoretical occupancy from varying our use of different resources, and we can see, for how we have the code set up now.

234
00:33:31.500 --> 00:33:52.860
Evan MacBride: varying register count per thread: if we were to reduce register count down to 32, we can see a substantial improvement in occupancy, up into the 60% range, but varying block size or shared memory usage at the moment won't really have much of an effect.
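
As a rough illustration of why 170 registers per thread caps theoretical occupancy at 12.5%, the register limit alone can be estimated as below. This is a minimal sketch, not the profiler's own calculation: the hardware figures (65,536 registers and 64 resident warps per SM, a 256-register allocation unit per warp, warps allocated in groups of 4) are assumptions typical of a Volta-class GPU, and the function name is hypothetical. It ignores block size and shared memory, so it reports a higher ceiling than the tool whenever those limits also apply.

```python
def register_limited_occupancy(regs_per_thread,
                               regs_per_sm=65536,        # assumed register file size per SM
                               max_warps_per_sm=64,      # assumed resident-warp limit per SM
                               warp_size=32,
                               reg_alloc_unit=256,       # assumed per-warp allocation unit
                               warp_alloc_granularity=4):
    """Estimate theoretical occupancy when registers are the only limiter."""
    # Registers are allocated per warp, rounded up to the allocation unit.
    regs_per_warp = regs_per_thread * warp_size
    regs_per_warp = -(-regs_per_warp // reg_alloc_unit) * reg_alloc_unit
    # Warps that fit in the register file, rounded down to the warp granularity.
    warps = regs_per_sm // regs_per_warp
    warps = (warps // warp_alloc_granularity) * warp_alloc_granularity
    warps = min(warps, max_warps_per_sm)
    return warps / max_warps_per_sm


for regs in (170, 64, 32):
    print(regs, register_limited_occupancy(regs))
```

Under these assumptions, 170 registers per thread leaves room for only 8 of 64 warps, which matches the 12.5% theoretical figure quoted in the talk.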

235
00:33:54.270 --> 00:34:04.710
Evan MacBride: So we have a lot of room to improve by reducing the number of registers, so an idea to do that is to reduce the number of private variables that this kernel uses.

236
00:34:05.400 --> 00:34:20.220
Evan MacBride: thus requiring fewer registers, and the way to do that would be by splitting up the kernel into smaller kernels and reducing the number of private variables for each of these smaller kernels.

237
00:34:21.780 --> 00:34:35.610
Evan MacBride: And, hopefully, that would alleviate this register pressure issue. However, we will have to keep an eye on the increase in overhead from extra kernel launches, whether that would.

238
00:34:36.480 --> 00:34:42.330
Evan MacBride: override any benefit we would get, so we will have to see as we're working on that.

239
00:34:42.870 --> 00:34:59.670
Evan MacBride: And then I'll note that after we do this work of splitting up the kernel, we will want to go back and re-profile to see if occupancy can be further improved at that point by either changing block size or varying the shared memory usage.

240
00:35:02.070 --> 00:35:05.640
Evan MacBride: All right, now we'll come to our roofline.

241
00:35:06.810 --> 00:35:07.350
Evan MacBride: plot.

242
00:35:08.820 --> 00:35:23.400
Evan MacBride: Nsight Compute automatically generates these roofline analyses for us. For the dots here, this red dot higher up, that is our double-precision achieved value.

243
00:35:24.300 --> 00:35:44.340
Evan MacBride: The blue dot is our lower, or sorry, our lower blue dot is our single-precision achieved value. So in a roofline plot, if we're under the slope here, that is telling us that we are bound in our performance by our memory bandwidth, whereas if we are.

244
00:35:45.810 --> 00:35:55.650
Evan MacBride: under this flat line, the flat part of the roofline, we are compute bound. So here we're clearly very much memory bound.

245
00:35:56.730 --> 00:36:00.360
Evan MacBride: And to increase our performance in.

246
00:36:01.410 --> 00:36:12.900
Evan MacBride: floating point operations per second, we would need to increase our arithmetic intensity to get under this flat portion of the plot.
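
The roofline reasoning in the cues above can be sketched as a short calculation. This is a minimal sketch of the general roofline model, not of how Nsight Compute computes its chart; the peak throughput and bandwidth figures are made-up placeholders, and the function names are hypothetical.

```python
def attainable_flops(arithmetic_intensity, peak_flops, peak_bandwidth):
    """Roofline ceiling: the lower of the compute roof and the bandwidth slope.

    arithmetic_intensity is in FLOP per byte, peak_flops in FLOP/s,
    and peak_bandwidth in bytes/s.
    """
    return min(peak_flops, arithmetic_intensity * peak_bandwidth)


def is_memory_bound(arithmetic_intensity, peak_flops, peak_bandwidth):
    """A kernel left of the ridge point sits under the sloped part of the roof."""
    ridge_point = peak_flops / peak_bandwidth
    return arithmetic_intensity < ridge_point


# Placeholder device figures, for illustration only.
PEAK_FLOPS = 7.0e12      # hypothetical double-precision peak, FLOP/s
PEAK_BANDWIDTH = 9.0e11  # hypothetical memory bandwidth, bytes/s

low_ai = 0.5  # hypothetical kernel doing 0.5 FLOP per byte moved
print(is_memory_bound(low_ai, PEAK_FLOPS, PEAK_BANDWIDTH))
print(attainable_flops(low_ai, PEAK_FLOPS, PEAK_BANDWIDTH))
```

In this model, raising arithmetic intensity moves a kernel to the right along the slope until it reaches the ridge point, after which the flat compute roof becomes the limit.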

247
00:36:14.310 --> 00:36:15.210
Evan MacBride: and

248
00:36:16.350 --> 00:36:17.700
Evan MacBride: yeah, here we see that.

249
00:36:18.840 --> 00:36:25.260
Evan MacBride: Nsight Compute is kind enough to give us some recommendations which sum up what we're looking at.

250
00:36:27.120 --> 00:36:32.910
Evan MacBride: And that's a useful little tool in most of the sections that we'll see in Nsight Compute.

251
00:36:34.830 --> 00:36:52.020
Evan MacBride: All right, the last section we'll look at is the source counters section. We have some information under recommendations, giving us these alerts that we have numerous uncoalesced global memory accesses.

252
00:36:53.070 --> 00:37:01.500
Evan MacBride: So that is something we want to look at, perhaps changing our memory access patterns to.

253
00:37:03.120 --> 00:37:12.030
Evan MacBride: improve that situation and hopefully increase performance something to note here that I think is pretty handy if you click on this address.

254
00:37:13.170 --> 00:37:19.650
Evan MacBride: It will take you to the SASS source page where you can see the exact.

255
00:37:20.820 --> 00:37:24.450
Evan MacBride: instruction that is causing the uncoalesced.

256
00:37:25.500 --> 00:37:26.940
Evan MacBride: memory access, and.

257
00:37:28.800 --> 00:37:33.750
Evan MacBride: I think, Daniel did make a note of this, that there is a way to tie that to.

258
00:37:35.610 --> 00:37:39.300
Evan MacBride: lines in your actual C code.

259
00:37:40.680 --> 00:37:57.030
Evan MacBride: Not something I've been able to get working yet; that is something I want to come back to, but I believe you are able to annotate this SASS code, so that you can see exactly where in your original code this instruction would lie.
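
To illustrate what an uncoalesced global memory access costs, the sketch below counts how many distinct memory sectors a single warp touches for a given access stride. It is a simplified model under stated assumptions (32 threads per warp, 32-byte sectors, 8-byte elements, an aligned base address), and the function name is hypothetical; it is not taken from the profiled code.

```python
def sectors_touched(stride_elems, elem_bytes=8, warp_size=32, sector_bytes=32):
    """Count distinct memory sectors one warp touches for a strided access.

    Lane i reads the element at index i * stride_elems from an aligned base.
    """
    addresses = [lane * stride_elems * elem_bytes for lane in range(warp_size)]
    return len({addr // sector_bytes for addr in addresses})


# Unit stride: neighbouring lanes share sectors (fully coalesced).
print(sectors_touched(1))
# Large stride: every lane lands in its own sector (uncoalesced).
print(sectors_touched(32))
```

In this model a unit-stride access by a warp needs 8 sectors, while a stride of 32 elements needs 32, so four times as much data moves for the same useful bytes. Reordering loops or array layout so that consecutive threads read consecutive elements is the usual way to reduce the count.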

260
00:37:58.920 --> 00:38:08.580
Evan MacBride: All right, so here we have some references and resources that I've drawn on to create my written guide and this.

261
00:38:09.870 --> 00:38:32.220
Evan MacBride: presentation as well. This link here is that written guide. I'd also like to thank you all for your attention. I'd like to thank Daniel again for all his help; he gave lots of very useful comments and I'm very glad for that. And that is all I had, and.

262
00:38:33.300 --> 00:38:38.910
Evan MacBride: hopefully be able to answer a few questions just going to.

263
00:38:39.930 --> 00:38:42.900
Evan MacBride: Stop sharing now.

264
00:38:45.120 --> 00:38:45.570
alright.

265
00:38:47.760 --> 00:38:53.610
Evan MacBride: So i'll look at the chat and see if I can.

266
00:38:54.870 --> 00:38:55.920
Evan MacBride: I can answer anything.

267
00:38:55.980 --> 00:39:02.940
Mick Coady: there's been a lively conversation discussion going on over there in the chat but then probably now's the time for.

268
00:39:04.320 --> 00:39:07.050
Mick Coady: asking Evan direct questions, anybody.

269
00:39:09.060 --> 00:39:09.960
Mick Coady: Thank you.

270
00:39:22.920 --> 00:39:23.430
Mick Coady: New one.

271
00:39:29.310 --> 00:39:37.500
Daniel Howard (CSG, NCAR): I guess, one of the questions you could try addressing from the chat i'm not sure you know, not like a there's talking about.

272
00:39:39.210 --> 00:39:52.680
Daniel Howard (CSG, NCAR): the validation issues in terms of, like, running one of the kernels inside of a loop versus outside the loop. I'm just curious if you have any thoughts yourself on maybe why that validation issue was coming up in terms of your use of the model.

273
00:39:54.930 --> 00:39:57.240
Evan MacBride: I really could not say.

274
00:40:02.340 --> 00:40:10.440
Evan MacBride: And I see Siena here is mentioning PCAST, which is what I would go to to try to hunt that down.

275
00:40:11.760 --> 00:40:19.440
Evan MacBride: But I unfortunately i'm not very familiar with, with the issue that we're talking about there.

276
00:40:20.670 --> 00:40:22.500
Daniel Howard (CSG, NCAR): Right now, specific numbers.

277
00:40:35.340 --> 00:40:36.150
Mick Coady: anybody else.

278
00:40:40.920 --> 00:40:45.510
Daniel Howard (CSG, NCAR): I guess, I have a question then i'm like maybe directed to what you're talking about actually.

279
00:40:46.800 --> 00:40:49.230
Daniel Howard (CSG, NCAR): When you talk about the uncoalesced memory access.

280
00:40:50.490 --> 00:41:00.900
Daniel Howard (CSG, NCAR): Were you ever able to make any changes there, in terms of what you saw, to actually fix or resolve some of those? Like, what did you see that then actually had an effect at that address?

281
00:41:01.530 --> 00:41:10.620
Evan MacBride: I have not worked on that. It's something that I just saw Nsight was telling us, so I thought I would make mention of it.

282
00:41:11.790 --> 00:41:18.810
Evan MacBride: as another piece of information that the profiler is giving us, but no, I haven't had the chance to work on that yet, unfortunately.

283
00:41:22.560 --> 00:41:33.510
Daniel Howard (CSG, NCAR): I see this warning myself often, and then, you know, I try to look at it further and I'm not sure exactly what needs to be done to fix those, so it'd be interesting, at some point, if someone ever finds out how to.

284
00:41:34.620 --> 00:41:37.110
Daniel Howard (CSG, NCAR): Actually, like consistently address some of those warnings.

285
00:41:39.780 --> 00:41:41.670
Mick Coady: This would be for sure.

286
00:41:44.280 --> 00:41:46.350
Mick Coady: john Dennis get your hands.

287
00:41:47.340 --> 00:41:59.940
John Dennis (he/him): Well, I have a number of questions but i'll ask the first one here, so it looks like there's a lot of technical information in the chat are we keeping any of the chat information or is that just going to be gone once I.

288
00:42:01.050 --> 00:42:01.590
John Dennis (he/him): hang up.

289
00:42:01.980 --> 00:42:10.590
Mick Coady: Good question john yeah it's being recorded and i'm, including those in the wiki the meeting notes and.

290
00:42:12.420 --> 00:42:12.750
Mick Coady: You know.

291
00:42:13.830 --> 00:42:16.200
Mick Coady: In in in the wiki pages for that.

292
00:42:16.260 --> 00:42:17.910
John Dennis (he/him): Okay, thank you yeah.

293
00:42:17.940 --> 00:42:21.450
Mick Coady: No, I saw that and i'm thinking well i'm glad that's in there today.

294
00:42:21.780 --> 00:42:25.350
Mick Coady: yeah yeah okay it and you said you had some more questions.

295
00:42:27.390 --> 00:42:28.140
John Dennis (he/him): I.

296
00:42:29.820 --> 00:42:32.550
John Dennis (he/him): Let me wait another five minutes or 10 minutes.

297
00:42:33.750 --> 00:42:37.830
Mick Coady: Okay, okay. Siena, you had your hand up briefly.

298
00:42:39.300 --> 00:42:39.540
cmille73: Oh.

299
00:42:40.800 --> 00:42:50.160
cmille73: Yes, on the technical information in the chat, you know, I think we also have some presentations we've done on MURaM.

300
00:42:51.570 --> 00:42:56.790
cmille73: Discussing that specific issue, we were discussing in the chat where exactly the.

301
00:42:57.870 --> 00:43:04.560
cmille73: best place for the kernel launch should be so maybe we could include that meeting down.

302
00:43:06.690 --> 00:43:14.160
Mick Coady: Okay, good, Siena. If you find that, I'd be glad to include that, if it's okay with you guys, along with the other.

303
00:43:15.240 --> 00:43:19.620
Mick Coady: recordings and slides in the within the wiki.

304
00:43:24.090 --> 00:43:24.390
Mick Coady: Okay.

305
00:43:24.630 --> 00:43:26.550
Mick Coady: yeah thanks, thank you.

306
00:43:34.290 --> 00:43:34.800
Okay.

307
00:43:36.000 --> 00:43:39.510
Mick Coady: john, I think, the floor is yours, if you want it if you're ready.

308
00:43:40.020 --> 00:43:43.020
John Dennis (he/him): yeah Okay, I guess, so so.

309
00:43:45.000 --> 00:43:55.560
John Dennis (he/him): I was in kind of a highly technical meeting with NVIDIA earlier in the week, and we were talking to them and they said they'd really like to kind of.

310
00:43:56.010 --> 00:44:13.050
John Dennis (he/him): sit in on these to kind of get the pulse of what's going on and, you know, what NCAR's and the community's thoughts are on GPUs and, you know, stuff like that. So one of the ideas was, well, maybe we want to invite them to GTT.

311
00:44:13.740 --> 00:44:20.130
John Dennis (he/him): Because that would get a broader swath of the community, and I just wanted to see what people's thoughts are on that.

312
00:44:24.750 --> 00:44:34.080
Mick Coady: Well, you know, near the end of last month's meeting Jeremy had pointed out that the attendance was pretty heavily CISL.

313
00:44:35.190 --> 00:44:52.380
Mick Coady: based, although, you know, we've got different groups represented here, so I think that would be a fine idea, expanding on that, but I'd open that up to the group, because this is certainly not just my decision to make.

314
00:44:54.570 --> 00:44:56.100
Mick Coady: Are you in favor of it, John?

315
00:44:56.640 --> 00:45:00.750
John Dennis (he/him): I am in favor of it, I mean you know we can invite them in and.

316
00:45:01.890 --> 00:45:08.490
John Dennis (he/him): You know, they can choose to come or not, and you know, and that was that was another thing that I was honestly concerned about.

317
00:45:08.940 --> 00:45:12.300
John Dennis (he/him): Is you know, I was looking at the participant list and there's.

318
00:45:14.730 --> 00:45:17.040
John Dennis (he/him): Was there like three people that are not in CISL.

319
00:45:18.240 --> 00:45:18.870
Mick Coady: reel I think.

320
00:45:18.930 --> 00:45:21.060
John Dennis (he/him): It won't be for a mate maybe for.

321
00:45:21.450 --> 00:45:22.380
Mick Coady: yeah I.

322
00:45:24.090 --> 00:45:29.700
Mick Coady: I tend to look at this to that the folks like you sherry and.

323
00:45:30.990 --> 00:45:32.310
Mick Coady: Others aren't.

324
00:45:33.990 --> 00:45:35.490
Mick Coady: Although you're in CISL you're not.

325
00:45:35.610 --> 00:45:41.340
Mick Coady: really part, you're not part of HPCD right now, and I realize that we've got a lot of.

326
00:45:43.770 --> 00:45:46.650
Mick Coady: HPCD is heavily represented here.

327
00:45:46.830 --> 00:45:49.020
Mick Coady: Which is, which is fine, but.

328
00:45:51.420 --> 00:45:52.380
John Dennis (he/him): But me.

329
00:45:53.460 --> 00:45:57.570
Mick Coady: as being more of the user community rather than the CISL.

330
00:45:58.590 --> 00:45:59.100
Mick Coady: side.

331
00:46:00.510 --> 00:46:00.930
John Dennis (he/him): yeah.

332
00:46:01.320 --> 00:46:06.060
Mick Coady: I'm not trying to be defensive here either, that's just my observation.

333
00:46:06.210 --> 00:46:13.800
John Dennis (he/him): No i'm not trying to be critical i'm just wondering how we can get a broader participation.

334
00:46:14.280 --> 00:46:23.220
Mick Coady: Yeah, this is fine. I'm more than open to, and ready to hear, suggestions on that.

335
00:46:23.700 --> 00:46:24.090
yeah.

336
00:46:25.530 --> 00:46:33.540
Daniel Howard (CSG, NCAR): As a quick interjection, I did just attend an NVIDIA meeting this morning as well, actually, in terms of, like, the hackathon-type stuff, so.

337
00:46:34.140 --> 00:46:45.690
Daniel Howard (CSG, NCAR): there's this conduit going back and forth, so to speak, in some of their meetings, as John said he was doing himself earlier, and so I think it'd be fine just to have it go the other direction as well; it would make sense.

338
00:46:47.340 --> 00:46:51.360
Mick Coady: Maybe we do go about this from the.

339
00:46:52.650 --> 00:46:59.040
Mick Coady: Participants here today would there are there any objections or concerns about opening this up to.

340
00:47:00.120 --> 00:47:01.740
Mick Coady: you know, some folks from NVIDIA.

341
00:47:05.280 --> 00:47:05.640
shiquan: I thought.

342
00:47:06.870 --> 00:47:07.500
shiquan: good idea.

343
00:47:11.970 --> 00:47:14.130
Mick Coady: Yeah, Davide, I didn't quite catch what you said.

344
00:47:16.650 --> 00:47:22.890
Davide Del Vento: I said I don't have any objection per se, I think it could be a good idea, but I.

345
00:47:24.480 --> 00:47:27.450
Davide Del Vento: In my view, and maybe I missed something.

346
00:47:29.100 --> 00:47:39.630
Davide Del Vento: You know this looks like user group meeting, rather than tiger team so so I don't know I mean if this is a user Group then that's fine.

347
00:47:40.440 --> 00:47:50.100
Davide Del Vento: I think it's actually very beneficial to have NVIDIA. If it's an actual tiger team, which means that we need to have some strategic discussion.

348
00:47:50.460 --> 00:48:05.790
Davide Del Vento: about what to do, and the main vendor for NCAR procurement, and for what we want to do for our futures, I think that would not, but, uh, so I think it goes by.

349
00:48:07.680 --> 00:48:14.760
Davide Del Vento: You know, are we up to the name, are we act what we're doing that's that's my two cents.

350
00:48:15.360 --> 00:48:15.720
Okay.

351
00:48:18.420 --> 00:48:22.800
John Dennis (he/him): yeah I you know I don't know exactly what a tiger team knows.

352
00:48:26.100 --> 00:48:28.710
John Dennis (he/him): But feeling more like a user community.

353
00:48:28.890 --> 00:48:29.670
slash.

354
00:48:30.960 --> 00:48:33.330
John Dennis (he/him): slash community of practice or something like that.

355
00:48:33.510 --> 00:48:35.580
Mick Coady: yeah yeah I agree.

356
00:48:35.820 --> 00:48:55.470
Davide Del Vento: I think tiger team tends to mean, and, I mean, everybody can use the term as they want, like, a group of people that need to solve one particular problem or make one decision, and they want to make it aggressively and quickly, you know, like a tiger.

357
00:48:56.580 --> 00:49:04.500
Davide Del Vento: That was my interpretation of all the tiger teams that started to pop up at NCAR in various fields.

358
00:49:05.640 --> 00:49:07.290
John Dennis (he/him): Should we change our name officially.

359
00:49:09.480 --> 00:49:12.480
Mick Coady: i'd need permission from earth on and on guys thanks for that.

360
00:49:14.370 --> 00:49:35.430
Mick Coady: But I do see, just more kind of behind the scenes, that CSG has a particular role to play as far as getting the community ready for Derecho and the future, that's probably a little bit different, or at least it complements these meetings and what this group is.

361
00:49:37.020 --> 00:49:39.150
Mick Coady: What I see this how i've been trying to.

362
00:49:39.360 --> 00:49:42.810
Mick Coady: structure these meetings and this group so.

363
00:49:44.070 --> 00:49:51.000
Irfan Elahi: And also, John, the plan was to disband this and merge this into NHUG at some time.

364
00:49:54.330 --> 00:49:54.870
John Dennis (he/him): yeah.

365
00:49:56.490 --> 00:50:06.780
Irfan Elahi: And I think you know as soon as we get the delivery and open is D I don't think we'll have this meeting a separate move as part of can have unless you guys have some.

366
00:50:07.890 --> 00:50:13.860
Irfan Elahi: different suggestion, definitely. I'm sorry, I had stepped away, so I missed a lot of your conversation.

367
00:50:15.330 --> 00:50:21.840
John Dennis (he/him): Well, you know, I was OK, so I found it interesting and i'm sorry I jumped the queue.

368
00:50:24.360 --> 00:50:27.870
Mick Coady: You got the floor supreme your saprissa boss, you can make it more.

369
00:50:30.330 --> 00:50:40.020
John Dennis (he/him): Well, you know Okay, so we were having a ton of a really technical discussion here and there was a lot of things going on in the in the chat and.

370
00:50:40.560 --> 00:50:56.790
John Dennis (he/him): You know, a lot of stuff going on in the chat. It's not clear to me that, you know, the broader community is going to want to talk about, you know, kernel launch overheads and, you know, synchronization percentages and stuff like that.

371
00:50:59.280 --> 00:50:59.850
John Dennis (he/him): Right.

372
00:51:00.180 --> 00:51:02.580
Irfan Elahi: yeah I agree with that that's a good point.

373
00:51:04.860 --> 00:51:13.290
Brian Vanderwende: John, when you say broader community, do you mean if more people are brought into this venue, or are you more concerned about the idea of folding into NHUG?

374
00:51:13.740 --> 00:51:17.940
John Dennis (he/him): Well, if we fold it into NHUG, are people really going to care about their.

375
00:51:18.000 --> 00:51:19.110
John Dennis (he/him): their kernel launches?

376
00:51:20.790 --> 00:51:23.820
Irfan Elahi: yeah and I think when this gets folded.

377
00:51:23.820 --> 00:51:25.380
Irfan Elahi: into NHUG, which Sidd is.

378
00:51:25.410 --> 00:51:46.680
Irfan Elahi: kind of chairing, I think it's up to Sidd to make sure that the conversation is steered in the right way. There will always be some folks who want to deep dive and have more, but it is, I think, up to the chair to bring the conversation back to the majority of the attendees' benefit.

379
00:51:49.590 --> 00:51:49.860
John Dennis (he/him): yeah.

380
00:51:49.890 --> 00:51:51.300
Sidd: hey yeah.

381
00:51:52.380 --> 00:51:52.680
Sidd: You.

382
00:51:53.790 --> 00:51:55.230
Sidd: Also, even with the john.

383
00:51:56.400 --> 00:51:58.530
Sidd: ma i'm just tossing this idea.

384
00:51:59.730 --> 00:52:00.960
Sidd: Like your feedback here.

385
00:52:02.610 --> 00:52:20.580
Sidd: Just like, the general user group has different special interest group policies, so in my view, GTT is part of NHUG, but can we view it as a special interest group and.

386
00:52:21.300 --> 00:52:30.660
Sidd: Also, conduct a separate more technically oriented user group type meeting, where we will only discuss duties that.

387
00:52:31.980 --> 00:52:33.030
Sidd: When being.

388
00:52:35.190 --> 00:52:46.410
Irfan Elahi: Yeah, if you're talking about having something like a breakout group, you know, after two or three NHUG meetings you decide, hey, we need to have a GTT-focused-only meeting.

389
00:52:46.950 --> 00:53:00.720
Irfan Elahi: And you can schedule another meeting in between two NHUG meetings with select participants who want to discuss a specific topic. I see no reason why you can't do that. I don't think we need to put hard.

390
00:53:01.710 --> 00:53:14.250
Irfan Elahi: boundaries and requirements, it should basically be able to evolve, based on the needs if there's a need to have a more deep dive technical, then, absolutely our.

391
00:53:15.180 --> 00:53:26.940
Irfan Elahi: schedule another talk for just those people, but I do agree with John Dennis that if NHUG is more user oriented and more a user group.

392
00:53:27.330 --> 00:53:36.390
Irfan Elahi: we need to then maybe, in those meetings, try to temper down deep-dive kernel and operating system type of discussion.

393
00:53:36.930 --> 00:53:58.290
Irfan Elahi: But those should not be forgotten; they can be discussed, as you pointed out, Sidd, in a separate, different meeting, definitely. I don't see any problem with that. Does anyone else have any suggestion or feedback? This is really good feedback, because we need to evolve.

394
00:53:59.790 --> 00:54:03.330
Irfan Elahi: Through these discussions and and feedbacks.

395
00:54:06.090 --> 00:54:17.370
Mick Coady: Yeah, totally agree, Irfan. In fact, that was really one of the last things I wanted to talk about today. We've had some people with their hands up and I want to make sure we get to them, but I was going to solicit.

396
00:54:18.600 --> 00:54:36.240
Mick Coady: and remind everyone that we're wide open for suggestions about topics and agenda items for these meetings, so please feel free to reach out to me or, you know, anyone in CSG about this, Irfan, so.

397
00:54:37.440 --> 00:54:47.160
Irfan Elahi: It runs once you get the feedback let's let's discuss to see how we can you know I I personally do like your idea but let's get more feedback.

398
00:54:47.700 --> 00:54:48.570
Mick Coady: Sure sure.

399
00:54:49.710 --> 00:54:51.480
Mick Coady: So I you.

400
00:54:52.620 --> 00:55:02.160
Mick Coady: A few minutes ago, Jeremy, you had your hand up, and Supreeth, you had your hand up for quite a while. Do you still want to interject anything?

401
00:55:03.930 --> 00:55:04.110
Mick Coady: So.

402
00:55:05.130 --> 00:55:05.640
Supreeth Madapur Suresh: I yeah.

403
00:55:06.750 --> 00:55:13.080
Supreeth Madapur Suresh: I just wanted to make a couple of points. The first one was, we were answering a lot of questions in the chat, so.

404
00:55:14.310 --> 00:55:29.280
Supreeth Madapur Suresh: I don't know if we missed any of the questions, so please feel free to reach out to us if you have a question, so that we may might have missed, there was a lot in chat The second thing was, I think, for this meeting.

405
00:55:30.450 --> 00:55:37.110
Supreeth Madapur Suresh: There, there was the concept of community of practice and we did all we all did few short presentations.

406
00:55:38.490 --> 00:55:48.630
Supreeth Madapur Suresh: In turn off my GP GP or tiger team, it could be gpu community of practice where we discuss in details all these issues and the techniques that we learn about.

407
00:55:50.490 --> 00:55:51.240
Mick Coady: yeah good.

408
00:55:52.560 --> 00:55:54.270
Mick Coady: interesting idea thanks.

409
00:55:55.800 --> 00:56:02.430
Mick Coady: Big and that could be it's kind of we're kind of evolving into that anyway so maybe the name could just reflect that right.

410
00:56:05.880 --> 00:56:10.920
Mick Coady: Okay, Jeremy, you had your hand up briefly. Do you have something?

411
00:56:11.970 --> 00:56:22.980
Jeremy Sauer: Yeah, Mick, I thought I'd say something and then decided it wasn't necessary, but it was really just based in the notion that, yeah, look, as long as CISL is the primary.

412
00:56:23.940 --> 00:56:34.890
Jeremy Sauer: group of people doing GPU things and coming to GTT meetings or NHUG meetings and talking about GPU things, or at least listening to GPU things.

413
00:56:36.060 --> 00:56:44.340
Jeremy Sauer: You know it's just what encourages right now apparently there aren't that many folks doing gpu stuff outside of sizzle right like it's just.

414
00:56:44.820 --> 00:56:52.260
Jeremy Sauer: So to me that's the key right like, if you want to be somehow different I mean fine, we can always have these conversations like when when you.

415
00:56:52.650 --> 00:56:57.600
Jeremy Sauer: When you break down in your profiler says that you're spending 25% of your time in overhead.

416
00:56:58.290 --> 00:57:13.050
Jeremy Sauer: costs from launching kernels, then go launch fewer kernels, right? So we can always have those conversations, but I guess this is all about trying to help NCAR use GPUs more, because GPUs are fundamentally the modern HPC, right, so.

417
00:57:14.340 --> 00:57:25.710
Jeremy Sauer: We need the rest of the institution to to want to do it to have their budgets aligned with with with pushing to gpus so that they can spend their time and effort pursuing it.

418
00:57:26.280 --> 00:57:43.440
Jeremy Sauer: And then I think we'll have more dynamic back and forth between the various folks who would attend meetings like these, so my two cents on a soapbox about it takes more of a Community to be a community of practice okay yeah.

419
00:57:44.220 --> 00:58:06.900
Mick Coady: And this is also, you know, I think there's an obligation or onus on our side, and mine in particular, to make these meetings and the group itself more relevant, so if that's missing then, man, I'd certainly appreciate and openly welcome feedback from you guys.

420
00:58:11.490 --> 00:58:20.460
Mick Coady: All right, well we're right up against the hour, so I bet many of you have other meetings to go to so I want to thank everyone for.

421
00:58:21.240 --> 00:58:32.670
Mick Coady: Very good presentation today and everybody for your participation, the the chat that was going on, through through the meeting was particularly interesting in that.

422
00:58:33.840 --> 00:58:40.110
Mick Coady: i'll try and make a point to capture that manually, just in case zoom recording happens to.

423
00:58:41.400 --> 00:58:43.530
Irfan Elahi: You I sent you a copy of the chat.

424
00:58:43.560 --> 00:58:44.310
Irfan Elahi: To you, and said.

425
00:58:44.730 --> 00:58:45.870
Mick Coady: Okay, thanks.

426
00:58:47.340 --> 00:59:10.500
Mick Coady: appreciate them okay well with that i'll let everybody go thank everybody for their participation and showing up today and look for now an email from me, probably by early next week when the recordings and slides are all ready for you on the on the wiki so take care, and thanks again.

427
00:59:11.940 --> 00:59:14.520
Irfan Elahi: yeah, thank you for the feedback everyone appreciate it.

428
00:59:14.820 --> 00:59:15.840
Mick Coady: yeah Thank you very much.

429
00:59:16.050 --> 00:59:17.190
Brian Dobbins: Thanks, Mick and Irfan.

430
00:59:18.390 --> 00:59:18.990
Evan MacBride: Thanks bye.

431
00:59:19.410 --> 00:59:20.280
Davide Del Vento: Thanks bye bye.

